Human-Machine Interaction Considerations for
Interactive  Software 


Abstract: This document introduces current concepts and techniques
relevant to the design and implementation of user interfaces. A user
interface refers to those aspects of a system that the user refers to,
perceives, knows and understands. A user interface is implemented by
code that mediates between a user and a system. This document covers
both aspects.


1.  Introduction 


This document introduces current concepts and techniques relevant to the design and
implementation of user interfaces. A user interface refers to those aspects of a system
that the user refers to, perceives, knows and understands. A user interface is
implemented by code that mediates between a user and a system. This document
covers both aspects.

The first chapter is an introduction to the psychology of human-computer interaction. It
presents the theoretical models that have had a significant impact on the evolution of
the field. These models offer a way to organize the design process and help
understand the cognitive processes involved in interacting with a computer.

The rest of the document is concerned with the software design of user interfaces and
shows how the cognitive principles established in that chapter can be put into
practice. Following a presentation of the abstractions involved in the organization of
an interactive system, attention is directed to the tools for constructing user
interfaces: windowing systems, toolkits and user interface management systems.

2. Models and Design Guidelines

Human-computer interaction is extensively cognitive. Even the most routine of
activities, such as text editing, involves problem solving, requires the formulation of
a sequence of commands and implies the communication of these commands to the
computer. To match the user's tasks, designers must go beyond their intuitive
judgments and exploit ideas from cognitive psychology and human factors. These
ideas may be classified into three categories:

•  Theoretical  models 

•  Practical  guidelines 

•  Test  strategies 

The tutorial concentrates on some of the significant theories, such as the Model of the
Human Processor [Card 83], GOMS [Card 83], the Theory of Action [Norman 86] and the
Theory of Knowledge [Shneiderman 87]. It also briefly presents some practical
guidelines based on these theories and, in particular, on the Command Language
Grammar [Moran 81]. [Shneiderman 87] can be consulted for detailed comments on test
strategies.


2.1.  Models  from  Cognitive  Psychology 

2.1.1.  Overview  of  the  Human  Processor  Model 

The Human Processor Model represents an individual as an information processing
system. This system is composed of three interdependent subsystems and operates
according to a set of principles. As Figure 2.1 shows, the subsystems include
perceptual, motor and cognitive systems. Each one is composed of a processor and a
memory. Processors and memories are characterized by parameters:

• τ, the processor cycle.

• μ, the storage capacity in items.

• δ, the decay time of an item, the time after which the probability of
retrieving the item is less than 50%.

• κ, the type of item held in memory (e.g., symbolic, physical).

Figure 2.1: The subsystems of the human processor.


The general principles of operation that Card, Moran and Newell proposed include:

• The Encoding Specificity Principle: "Specific encoding operations
performed on what is perceived determine what is stored, and what is
stored determines what retrieval cues are effective in providing access
to what is stored." [Card 83, p. 27]

• The Discrimination Principle: "The difficulty of memory retrieval is
determined by the candidates that exist in the memory relative to the
retrieval cues." [Card 83, p. 27]

•  The  Rationality  Principle:  "A  person  acts  so  as  to  attain  his  goal  through 
rational  action,  given  the  structure  of  the  task  and  his  inputs  of 
information  and  bounded  by  limitations  on  his  knowledge  and 
processing  ability."  [Card  83,  p.  27] 

• The Problem Space Principle: "The rational activity in which people
engage to solve a problem can be described in terms of (1) a set of
states of knowledge, (2) operators for changing one state into another,
(3) constraints on applying operators, and (4) control knowledge for
deciding which operator to apply next." [Card 83, p. 27]

The last two principles have served as a basis for the model presented
in Section 2.2.

The  following  subsections  describe  the  usefulness  of  the  model  from  the  point  of  view 
of  the  computer  scientist. 

2.1.2. The Perceptual System

The  perceptual  system  consists  of  a  set  of  subsystems,  each  one  specialized  in  the 
processing  of  a  particular  class  of  stimuli.  A  stimulus  is  a  physical  phenomenon  that 
can  be  detected  by  a  perceptual  subsystem.  A  perceptual  subsystem  includes  a 
processor,  sensors  and  memory  buffers  called  the  visual  image  store  (for  the  visual 
subsystem)  and  the  auditory  image  store  (for  the  auditory  subsystem). 

The  visual  image  store  holds  the  output  of  the  visual  sensory  subsystem.  It  contains 
the  physical  representation  of  some  stimuli,  i.e.,  a  coding  that  characterizes  the 
physical  properties  of  the  stimuli.  For  example,  in  the  visual  image  store  represented  in 
Figure 2.2, the coding of the character P expresses some shape and size but does not
express the recognition of the character. Recognition is performed by the cognitive
system  described  in  Section  2.1.4. 

A stimulus that impinges upon the retina at time t is available in the visual image store
at time t + τs, where τs is the cycle of the visual processor. The mean cycle of the visual
processor is around 100 msec and varies with the intensity of the stimuli. This means
that an individual generally needs 100 msec before having the feeling of perceiving. In
other words, two images produced in the same cycle are perceived as a single one.
As a result, refreshing the screen will appear instantaneous to the user if the
image can be produced in less than 100 msec. Satisfying the 100 msec constraint
relies heavily on hardware technology and has an impact on software construction. An
example is the work of Uebbing [Uebbing 86] in analyzing the objects in object-oriented
languages. One drawback of object-oriented languages is the overhead due
to message passing. Uebbing comments on an interesting experiment about code
optimization. He shows how to reorganize objects and minimize message passing
times. Knowing:

1. τm, the transfer time of a message between two objects (e.g., 0.04 msec
for Objective-C on an MC68010),

2. n, the number of elementary objects comprised in a compound object,

then the total time τ spent in message passing to redraw the compound object is
τ = n × τm. If τ is greater than the threshold, which is a function of the visual
processor cycle τs, then it is desirable to:

• Minimize message passing by reorganizing the compound object.

• Draw part or all of the compound object with low-level tools (even
assembly language if this turns out to be necessary).
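
The message-passing estimate above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Uebbing's procedure: the function names and the object counts are hypothetical, the 0.04 msec transfer time is the figure quoted above, and the 100 msec budget simply stands in for the threshold derived from the visual processor cycle.

```python
# Sketch: time spent in message passing when redrawing a compound object.
# All names are illustrative; only the two constants come from the text.

VISUAL_CYCLE_MS = 100.0      # mean visual processor cycle quoted in the text
MESSAGE_TRANSFER_MS = 0.04   # quoted transfer time (Objective-C on an MC68010)


def message_passing_time_ms(n_objects: int,
                            transfer_ms: float = MESSAGE_TRANSFER_MS) -> float:
    """Total message-passing time: number of elementary objects times transfer time."""
    if n_objects < 0:
        raise ValueError("n_objects must be non-negative")
    return n_objects * transfer_ms


def needs_optimization(n_objects: int,
                       budget_ms: float = VISUAL_CYCLE_MS) -> bool:
    """True when message passing alone exceeds the assumed redraw budget."""
    return message_passing_time_ms(n_objects) > budget_ms


if __name__ == "__main__":
    for n in (500, 1000, 5000):
        t = message_passing_time_ms(n)
        print(f"{n} objects: {t:.1f} msec, optimize={needs_optimization(n)}")
```

With these assumed figures, a compound object made of a few thousand elementary objects already consumes the whole budget in message passing, which is the situation in which reorganizing the object or dropping to lower-level drawing becomes attractive.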


message  passing  is  the  notion  of  windowing  service  through  local  area  networks.  This 
technique  will  be  subsequently  developed  in  Sections  3  and  4.  X-Windows  [Scheifler 
86],  which  is  such  a  server,  is  able  to  handle  mouse  events  fast  enough  to  make 
immediate  feedback  possible  without  making  the  user  aware  of  the  network. 

2.1.3.  The  Motor  System 

Shortly  after  information  has  reached  a  perceptual  memory,  the  cognitive  system 
receives  symbolically  coded  information  in  its  working  memory.  The  cognitive  system 
uses previously stored information in the long-term memory to make decisions about
how to respond: the model views thought as translated into actions by activating
muscle movements. The motor system is responsible for movements. Movements that
are  of  interest  for  human-computer  interaction  include  arm-hand  and  eye-head 
gestures. 

A movement is made of a sequence of discrete micromovements. Each
micromovement requires one cycle τm of the motor system. The mean value of τm has
been estimated at 70 msec. With the hypothesis that a movement results from a
sequence of micromovements, it is possible to compute the theoretical time to move the
hand to a given target. Figure 2.3 shows the initial situation: the hand is located at X0,
at a distance D from the target (X0 = D). The size of the target is S. After the first
micromovement, the hand is at X1, then at X2, etc. One can show that the time T
required to place the hand on a target depends on the required relative precision, that
is, on the ratio between the distance and the size of the target:

T = I log2(D/S + 0.5)

where I is a constant determined experimentally (around 100 msec). This equation is
known as Fitts's law.


Fitts's law can be usefully applied to determine the time spent in hand homing between
input devices or in object selection on the screen. Such computations can serve as a
quantitative evaluation of alternatives between syntaxes.
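
As an illustration of such a computation, the following Python sketch evaluates Fitts's law as stated above. The function name, the clamping of negative results to zero, and the sample distances and sizes are assumptions made for the example; the constant of 100 msec is the approximate value given in the text.

```python
import math

FITTS_CONSTANT_MS = 100.0  # experimentally determined constant (about 100 msec)


def fitts_time_ms(distance: float, size: float,
                  constant_ms: float = FITTS_CONSTANT_MS) -> float:
    """Predicted positioning time T = I * log2(D/S + 0.5), in msec.

    distance and size must use the same unit (pixels, centimeters, ...).
    """
    if distance < 0 or size <= 0:
        raise ValueError("distance must be >= 0 and size must be > 0")
    # When the hand is already within the target, the formula goes negative;
    # the sketch clamps the prediction to zero (an assumption of this example).
    return max(0.0, constant_ms * math.log2(distance / size + 0.5))


if __name__ == "__main__":
    # Two hypothetical layouts for the same command.
    small_far = fitts_time_ms(distance=15.5, size=1.0)
    large_near = fitts_time_ms(distance=7.5, size=1.0)
    print(f"small, distant target: {small_far:.0f} msec")
    print(f"closer target:         {large_near:.0f} msec")
```

Because the prediction depends only on the ratio D/S, doubling the size of a target saves as much time as halving its distance, which gives a simple way to compare alternative screen layouts.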

2.1.4.  The  Cognitive  System 

There  are  two  important  memories  in  the  cognitive  system:  the  working  memory  and 
the  long-term  memory  (see  Figure  2.4).  The  working  memory  (also  called  short-term 
memory)  holds  information  under  current  consideration  just  like  the  general  registers  of 
a computer. It contains the intermediate product of thinking, the representations
produced  by  the  perceptual  system,  and  a  subset  of  activated  items  extracted  from  the 
long-term  memory.  The  long-term  memory  stores  knowledge  for  future  use  in  the  form 
of  symbols,  called  chunks. 

A chunk is a cognitive unit whose nature depends on the user. For example, SNCF is
made of four chunks (i.e., the four letters S, N, C and F) for someone who does not
know that SNCF is the acronym for the French train company, whereas it is a single
chunk for French people. Chunks can be organized into larger units and be related to
other chunks. For example, the chunk "car" is composed of the chunks "wheel," "body,"
etc., and the chunk "weather" is related to the chunks "sun," "rain," "cloud." Semantic
networks have been widely used to represent such relationships between pieces of
knowledge.

When a chunk is activated, previously activated chunks are less available because of
the limited capacity of the working memory. The new chunks interfere with the other ones,
which tend to disappear from the working memory if they are not reactivated. Note that
the working memory behaves like the working set of virtual memory paging systems:
when a page fault occurs (i.e., when a chunk is activated), pages in the main memory
that have not been used (i.e., chunks that have not been reactivated) are swapped out
to let the last referenced page be installed in the main memory. The Room model
presented in Section 4.7 illustrates this notion of "cognitive working sets" by organizing
the task space of the user in closely related windows.
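
The paging analogy can be made concrete with a toy model. The Python sketch below is an assumption-laden illustration rather than a model taken from the literature: it treats the working memory as a fixed-capacity store and displaces the least recently activated chunk first, which is one simple reading of "chunks that have not been reactivated are swapped out." The class and method names are hypothetical.

```python
from collections import OrderedDict
from typing import List, Optional


class WorkingMemory:
    """Toy working memory: fixed capacity, least recently activated chunk leaves first."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._chunks: "OrderedDict[str, None]" = OrderedDict()

    def activate(self, chunk: str) -> Optional[str]:
        """Activate a chunk and return the chunk it displaced, if any."""
        if chunk in self._chunks:
            # Reactivation: the chunk becomes the most recently used one.
            self._chunks.move_to_end(chunk)
            return None
        self._chunks[chunk] = None
        if len(self._chunks) > self.capacity:
            displaced, _ = self._chunks.popitem(last=False)
            return displaced
        return None

    def contents(self) -> List[str]:
        """Chunks currently held, from least to most recently activated."""
        return list(self._chunks)


if __name__ == "__main__":
    memory = WorkingMemory(capacity=3)
    for chunk in ("file name", "cursor position", "search string"):
        memory.activate(chunk)
    memory.activate("file name")                 # reactivated, so it is kept
    print(memory.activate("replacement text"))   # displaces "cursor position"
    print(memory.contents())
```

In this reading, a chunk that the interface keeps visible (and therefore easy to reactivate) is less likely to be displaced than one the user must hold unaided.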

The capacity of the long-term memory is infinite: there is no erasure from the long-term
memory, but retrieval of a chunk may fail. This failure may have several causes: no
association can be found, or similar associations to several chunks interfere with the
retrieval of the target chunk. As a consequence, the best way to remember something
later and avoid chunk interference is to associate it with chunks of the long-term
memory in a unique way.

While the capacity of the long-term memory is infinite, that of the working memory is
very limited. It has been demonstrated that the capacity of the short-term memory is 7 ±
2 [Miller 75]. As a result, software engineers should not only pay attention to short-term
memory overload but also devise effective electronic extensions. Section 2.6
shows that menus and forms constitute such appropriate extensions.

Figure 2.4: Perceptual memories, short-term memory and long-term memory.


2.1.5.  Evaluation  of  the  Human  Processor  Model 

Clearly,  the  Human  Processor  model  is  a  simplification  of  the  complex  state  of  present 
knowledge  in  cognitive  psychology.  However,  it  provides  the  computer  scientist  with  a 
comprehensible  framework  on  which  various  aspects  of  this  knowledge  can  be 
gradually plugged. Actually, the goal of Card et al. goes beyond providing a framework
for thought. The goal is to create a new discipline that would combine characteristics
from fundamental and applied sciences. As in physics, this discipline would allow the
designer to perform approximate evaluations. With the help of a technical theory
[Newell 86], it would be possible to elaborate models that would allow the designer to
answer questions about a particular phenomenon in human-computer interaction. The
model of the Human Processor is a step towards this technical theory. To this end, it
introduces parameters that help in formalizing user performance and making predictive
evaluations.

Unfortunately, the parameters of the model of the Human Processor are useful for
computing low-level behavior only. They are useful in determining the optimal rate for
refreshing the screen; they stress the influence of target size on the effectiveness of
selection actions; they explain why special attention should be devoted to short-term
memory overload. Although mathematical expressions bring some scientific coloration
to the development of a domain, the parameters of the model of the Human Processor
are driven purely by performance considerations. They do not help in
understanding the underlying cognitive processes that lead to such performance.
The principles of operation that accompany the model are an attempt in this direction.
One of them, the principle of rationality, serves as a basis for GOMS (goals, operators,
methods, and selection rules) [Card 83], described in the next subsection.


2.2.  Practical  Guidelines  for  Design 

GOMS: 

• Is based on the theoretical hypothesis described in the previous
subsection: a human being acts in a rational manner.

•  Is  a  model  for  the  performance  of  the  user  who  does  not  make  errors. 

•  Structures  the  cognitive  activity  involved  in  accomplishing  a  task  into 
four  components:  Goal,  Operators,  Methods,  Selection. 

A  goal  is  a  symbolic  structure  that: 

•  Defines  a  desired  state. 

•  Determines  the  set  of  methods  which  lead  to  this  goal. 

•  Constitutes  a  backtrack  point  in  case  of  failure. 

Goals are organized hierarchically. The leaves of the hierarchy are operators. For
example, when starting to edit a document, the user has the top-level goal
"edit-manuscript." The user segments this larger task into smaller tasks and devises the
subgoals to achieve the subtasks. Figure 2.5 gives an example of such a subtask,
which consists of transposing two words.

Figure 2.5: The decomposition of a goal (transpose two words) into a hierarchy of
subgoals. The leaves of the tree denote physical actions. The illustration
does not make explicit the selection rules applied by the user
when several methods lead to the same goal.


An  operator: 

•  Is  a  perceptual,  a  motor  or  a  cognitive  action. 

•  Provokes  a  change  in  the  mental  and  environmental  state. 

•  Is  characterized  by  I/O  parameters  and  an  execution  time. 

A  method: 

• Describes the know-how. The know-how is made of learned procedures
that the user already has at execution time. They are not plans created
at execution time. The learned procedures express skill built from prior
experience. They reflect the knowledge of the exact sequence of steps
to accomplish a task.

• Is a conditional sequence of goals and operators.

A  selection  rule  determines  the  choice  between  the  methods  that  achieve  the  same 
goal. 
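
The four GOMS components can be pictured as a small data structure. The Python sketch below is an illustrative reading of the definitions above, not a notation from [Card 83]: the class names, the goals, the operators, their execution times and the selection rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Union


@dataclass
class Operator:
    """Leaf of the goal hierarchy: an elementary action with an execution time."""
    name: str
    time_s: float  # hypothetical execution time, in seconds


@dataclass
class Goal:
    """A goal with named methods; each method is a sequence of subgoals and operators."""
    name: str
    methods: Dict[str, List[Union["Goal", Operator]]]
    # Selection rule: chooses a method name from the current context.
    select: Optional[Callable[[dict], str]] = None


def expand(goal: Goal, context: dict) -> List[Operator]:
    """Expand a goal into the flat sequence of operators that achieves it."""
    if goal.select is not None:
        method_name = goal.select(context)
    else:
        method_name = next(iter(goal.methods))  # single-method goal: no rule needed
    trace: List[Operator] = []
    for step in goal.methods[method_name]:
        if isinstance(step, Goal):
            trace.extend(expand(step, context))
        else:
            trace.append(step)
    return trace


# Hypothetical operators and goals for deleting a word.
point = Operator("point-to-word", 1.1)
double_click = Operator("double-click", 0.4)
move_with_keys = Operator("move-cursor-with-keys", 2.0)
select_with_keys = Operator("select-word-with-keys", 0.6)
press_delete = Operator("press-delete-key", 0.2)

select_word = Goal(
    "select-word",
    methods={"mouse": [point, double_click],
             "keyboard": [move_with_keys, select_with_keys]},
    select=lambda ctx: "mouse" if ctx.get("hand_on_mouse") else "keyboard",
)
delete_word = Goal("delete-word", methods={"default": [select_word, press_delete]})

if __name__ == "__main__":
    steps = expand(delete_word, {"hand_on_mouse": True})
    print([op.name for op in steps], sum(op.time_s for op in steps))
```

The sketch mirrors the errorless, linear behavior that GOMS assumes: a goal is always expanded to completion, and nothing in the structure represents an interruption or a recovery from error.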

GOMS  can  be  used  to  model  and  predict  the  user's  behavior  at  various  levels  of 
abstractions.  One  application  of  GOMS  at  a  low  level  of  abstraction  is  the 
KEYSTROKE  level  model  [Card  83]  which,  given  a  command  language,  allows  the 
designer  to  predict  the  time  needed  by  the  user  to  enter  a  command. 
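
A keystroke-level prediction reduces to summing operator times over the sequence of actions needed to enter a command. The Python sketch below assumes the operator times commonly cited for the Keystroke-Level Model (K for a keystroke, P for pointing, H for homing the hands, M for mental preparation); they are placeholders for this example and should be replaced by values calibrated for the users and devices at hand. The function name and the two encoded commands are hypothetical.

```python
# Assumed operator times in seconds (commonly cited values; treat as placeholders).
KLM_TIMES_S = {
    "K": 0.28,  # press a key or button
    "P": 1.10,  # point with a mouse to a target on the display
    "H": 0.40,  # home the hands on the keyboard or another device
    "M": 1.35,  # mentally prepare for an action
}


def predict_time_s(operators: str, response_time_s: float = 0.0) -> float:
    """Predicted errorless execution time for a sequence such as 'MKKKK'."""
    total = response_time_s
    for op in operators:
        if op not in KLM_TIMES_S:
            raise ValueError(f"unknown operator: {op!r}")
        total += KLM_TIMES_S[op]
    return total


if __name__ == "__main__":
    typed = predict_time_s("MKKKK")   # think, then type a four-character command
    pointed = predict_time_s("HMPK")  # reach for the mouse, think, point, click
    print(f"typed command:  {typed:.2f} s")
    print(f"menu selection: {pointed:.2f} s")
```

Comparing the two totals is the kind of quantitative evaluation the model is meant to support; like GOMS itself, the prediction only covers errorless behavior.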

To summarize, GOMS:


•  Is  useful  for  predicting  errorless  behavior. 

• Does not deal with concurrent operations: the behavior is assumed to be
linear. The goal stack model does not fit nonlinear planning, and nonlinear
planning is required to deal with the user's interruptions (e.g.,
errors).

• Is behaviorist: it is a model about performance. It is not cognitive, unlike
the theory of action presented in the next section.


2.3.  The  Theory  of  Action  and  Conceptual  Models 

One of the goals of cognitive engineers is to identify and understand the principles that
guide the actions of the individual. The theory of D. Norman relies on the hypothesis
that the user elaborates conceptual models and that task accomplishment involves
several stages [Norman 86].

2.3.1.  Conceptual  Models 

A  conceptual  model: 

•  Is  a  mental  representation  of  oneself  and  of  the  environment. 

• Depends on previous knowledge and understanding.

•  Is  modified  by  the  nature  of  the  interaction. 

When considering the interaction of a user with an artifact, it is important to consider
two conceptual models (the designer's and the user's conceptual models) and the
notion of system image. If the artifact is a computer, there is also the system's model to
consider. Figure 2.6 represents these models.


Figure  2.6:  Conceptual  models. 


The  Designer's  Conceptual  Model: 

•  Is  the  model  that  the  designer  devises  for  the  artifact. 

• Relies on the representation that the designer has of the typical user
of the artifact. Ideally, this conceptualization is based on a thorough
analysis of the user's tasks, requirements, capabilities, background and
experience.

The  User’s  Conceptual  Model: 

•  Results  from  the  user's  interpretation  of  the  system  image. 

• Defines the "view" that the user has of the system.

The  System  Image: 

•  Results  from  the  physical  structure  that  has  been  built  (artifact). 

• Should be explicit, intelligible and consistent, so that the user may
elaborate a conceptual model compatible with the design model. The
burden is placed on the image that the system projects. Accomplishing
a task will be easier or harder, depending on the system image.

The  System's  Model: 

•  Is  the  model  that  an  intelligent  program  might  build  about  the  user. 

•  Allows  for  automatic  customization. 

2.3.2.  Toward  a  Theory  of  Action:  Stages  of  User  Activities 

Accomplishing  a  task  involves  approximately  seven  stages  (see  Figure  2.7): 

1. Establishing the Goal

A goal is a mental representation of the desired state. It is expressed in terms
of psychological variables. The system state is defined by the value of its
physical variables, such as the location of the cursor or a sequence of words
that forms a sentence. The user compares the system state to the goal. To do
so, the system state is translated into a psychological representation.

2.  Forming  the  Intention 

The evaluation of the distance between the goal and the translated state of the
system gives rise to an intention. An intention is the decision to act toward
achieving a goal. An intention is stated in psychological terms. It specifies the
meaning of the input expression that is to satisfy the user's goal. To do so, the
user must know the mapping between the psychological variables and the
physical variables. For example, the user must have established the
correspondence between the notion of insertion point, which is a psychological
variable, and the location of the cursor, which is a physical variable. As another
example, in order to achieve the goal "delete word1" in Figure 2.5, the user
must know the link between suppressing a word, which is a psychological
notion, and the command "cut," which is a physical input expression. The user
must know the effect, that is, the meaning, of the command "cut."

3.  Specifying  the  Action  Sequence 

The  intention  must  be  translated  into  a  sequence  of  actions.  To  do  so,  the  user 
has  to  know  the  mapping  between  the  physical  variables  and  the  physical 
control  mechanisms.  A  physical  control  mechanism  allows  for  the  modification 
of  physical  variables.  The  specification  of  an  action  sequence  is  a  mental 
representation  of  the  actions  to  perform  on  the  physical  control  mechanisms.  It 
prescribes  the  form  of  the  input  expression  that  has  the  desired  meaning.  For 
example,  the  user  must  know  that  the  location  of  the  cursor  can  be  modified 
with the mouse. In the example in Figure 2.5, the user knows the syntactic-lexical
definition of the command "cut."

4. Executing the Action

The execution of an action is the manipulation of physical control mechanisms.

5.  Perceiving  the  System  State 

The system state is embedded in an output expression. The perception of this
expression is the translation of the physical variables into psychological
variables. For example, after typing the character backspace in Figure 2.5, the
user perceives that the output expression no longer contains the word
displayed in reverse video in the previous output expression.

6. Interpreting the System State

The  interpretation  of  the  output  expression  results  in  determining  the  meaning 
of  the  output  expression.  For  the  example  in  Figure  2.5,  the  disappearance  of 
the  word  is  interpreted  as  the  deletion  of  the  word. 

7. Evaluating the System State with Respect to the Goals

The  evaluation  establishes  the  relationship  between  the  meaning  of  the  output 
expression  and  the  user's  mental  goal.  This  evaluation  may  result  in  a 
modification  or  in  continuing  to  the  next  step  in  the  plan. 

Accomplishing a task:

•  Does  not  necessarily  require  the  presence  of  the  seven  stages. 

•  Does  not  require  these  stages  to  happen  in  a  specific  order. 

•  Creates  different  needs  at  different  stages.  For  example,  menus  can 
assist  in  the  stage  of  forming  an  intention  and  specifying  an  action,  but 
frequently  make  execution  more  clumsy. 

• Does require a translation between the psychological representations
and the physical presentations. This translation reveals the existence of
a gap between the mental world and the physical world. Norman calls
this discrepancy a "gulf."

The gulf between the user and the system is two-way: from the mental representation
to the physical presentation, and from the physical world to the mental world. The first
gap is called the gulf of execution, whereas the second is the gulf of evaluation.

The gulf of execution consists of the semantic distance and the articulatory distance.
The semantic distance is covered by the intention, which goes from the goal to the
specification of the meaning of an input expression that is to satisfy the goal. The
articulatory distance is covered by the action specification, which goes from the
meaning of the input expression to its syntactic/lexical form.

The gulf of evaluation also consists of an articulatory distance and a semantic distance,
covered respectively by the interpretation of the output expression and the evaluation
of the meaning of the output expression.

In summary, this theory stresses the fact that accomplishing a task involves
several stages, that each stage has its own possibly conflicting needs, that these needs
result from the gulf between the mental representation and the physical presentation,
and that this gulf should be bridged by the system designer as much as possible
through the system image. Conversely, if the matches between the psychological and
the physical variables are weak, the user has to bridge the gulfs by creating more
plans, more action sequences and more interpretations that move the psychological
description closer to the physical requirements.

In contrast to GOMS, which provides the designer with a synthetic view of human behavior,
Norman's theory of action analyzes the mental processes that lead to such behavior.
Whereas GOMS is limited to the ideal case of errorless interaction, Norman stresses
the difficulties encountered by the user and provides the designer with a general
framework for explaining the cause of errors. GOMS is a quantitative model about
human performance, whereas Norman's theory of action is an informal, explanatory,
cognitive model about human behavior. The informal nature of Norman's theory
prevents the designer from making predictive evaluations. However, such a theory can
serve as a basis for the development of evaluation techniques (e.g., ETIT [Moran 83]).
The intuitive view of Norman's theory is interestingly complemented by ACT*
[Anderson 83], a formal theory of human cognition based on production systems.


2.4.  Theory  of  Knowledge:  The  Semantic/Syntactic 
Model  of  Knowledge 


The nature of knowledge has been studied extensively, resulting in various theories
about how knowledge is organized and exploited. This section first briefly describes a
general theory of knowledge, then the semantic/syntactic model of knowledge,
which is useful in the context of user interface design.

2.4.1. A General Theory [Simon 84, Card 83]

Subsection 2.1.4 explains that knowledge is organized as a network of chunks. This
network contains two classes of information:

• Factual knowledge: a set of assertions, predicates and known facts, possibly
with confidence factors.

• Procedural knowledge: a set of procedures that describe the know-how.
A procedure is an elementary action, such as a computer instruction.
Unlike the computer instruction set, the procedure set of the cognitive
processor evolves with time.

In the context of human-machine interaction, the chunks of interest here are those that
constitute the user's conceptual model. This conceptual model contains facts and
know-how about the system. Today, it is widely agreed that these facts and skills can be
classified into two categories: syntactic knowledge and semantic knowledge (see
Figure 2.8) [Shneiderman 87].


[Figure: The User's Model, containing Semantic Knowledge and Syntactic Knowledge]

Figure 2.8: The Syntactic/Semantic model of knowledge.


2.4.2.  Syntactic/Semantic  Knowledge 

Syntactic  knowledge: 

• Represents the linguistic conventions that the user must know to specify
requests to the system (input expressions) or to interpret responses from
the system (output expressions). These conventions allow the user to
communicate with the system image.

• Is system dependent.

• Is arbitrary, inconsistent, difficult to retrieve and has many other negative
qualities.

•  Must  be  acquired  by  rote  memorization  and  repetition. 

Semantic knowledge is:

•  An  organized  hierarchy  of  factual  and  procedural  concepts:  factual 
concepts  are  in  the  form  of  objects  or  data.  Procedural  concepts  are 
operations  on  objects  or  procedures  on  data.  In  addition,  a  distinction 
should  be  made  between  domain-dependent  objects  and  operations, 
and  system-dependent  objects  and  operations. 

•  Potentially  transferable  across  different  computer  systems. 

•  Independent  of  syntactic  details. 

•  Acquired  by  meaningful  learning. 

The distinctions between syntax and semantics, and between domain-dependent
concepts and system-dependent concepts, match the usual forms of competence: a
user may be incompetent in a domain but skillful at using a particular computer.
Conversely, the user may be knowledgeable in a field, but ignorant in the use of a
particular computer system.


2.5  Theoretical  Models:  Summary 

Models  presented  so  far  are  concerned  with  phenomena  related  to  human-computer 
interaction. 

• Some models, such as the Human Processor Model, GOMS and
Keystroke, are useful for making quantitative predictions about a
particular design. However, by oversimplifying the real world, they are
too limited in scope and too low level.

• Other models, such as Norman’s Theory of Action and Shneiderman's
model of Syntactic/Semantic Knowledge, provide the designer with
explanations about the cognitive behavior of the user. Although they
take a more realistic view of the real world, their lack of
scientific formalism makes them unusable as predictive tools.

The user interface designer has the difficult task of integrating these various theories
into a unique "easy-to-whatever" computer system! Combining all of these principles
leads directly to a combinatorial explosion, which may be
avoided with the use of heuristics. Heuristics do not guarantee an optimal solution,
but they provide a reasonable answer. The following section introduces some general
heuristics that need to be adapted to the peculiarities of the specific case at hand.


2.6.  Practical  Guidelines:  Methods  and  Golden  Rules 

The general method presented in this section is an application of the Command
Language Grammar (CLG) [Moran 81], although CLG is not itself a
methodology. CLG conveys a top-down approach that can be
useful as a framework for designing user interfaces. CLG is a grammatical structure for
representing computer systems at various levels of abstraction. Each level of
representation defines a particular view of the system, and each view results from an
analysis that any competent designer should perform. Figure 2.9 illustrates the
principles of CLG, whose terminology is explained in Section 2.6.1.

2.6.1.  General  Method  for  User  Interface  Design 

The design of a particular interactive system may be structured along five axes:

1.  Definition  of  the  profile  of  the  user  based  on  a  general  classification 
(notion  of  novice,  expert,  and  occasional  user,  combined  with  the  notion 
of  semantic  and  syntactic  knowledge). 

2.  Definition  of  the  profile  of  the  tasks:  utility  of  the  system  according  to  the 
needs  of  the  user.  This  constitutes  the  task  level  of  CLG.  It  consists  of 
defining the domain-dependent entities as perceived by the user:

•  the  task  entities  of  the  domain. 

• the tasks to be performed in the domain.

• the decomposition of the tasks into a hierarchy of subtasks.

•  the  task  procedures  (methods)  to  perform  the  various  tasks. 

•  the  privileged  tasks,  i.e.,  tasks  that  need  special  attention  due, 
perhaps,  to  their  frequency. 

3.  Definition  of  system-dependent  notions  to  implement  the  domain- 
dependent  concepts.  This  constitutes  the  semantic  level  of  CLG.  It 
includes: 

• the conceptual entities, which act as the electronic
representations of the conceptual objects and of the additional
entities that the system uses for its own purposes.

• the user and system conceptual operations to manipulate the
conceptual entities (looking for a piece of information on the screen is
considered a user conceptual operation).

• the semantic procedures (methods) expressed in terms of the
user and system conceptual operations to perform the tasks
defined in the task level.

4. Definition of the structure of the dialogue in layers of increasing
complexity and leading to task closure. This is the syntactic level of
CLG. It includes:

•  the  commands  and  their  arguments. 

•  the  clustering  of  commands  into  contexts  and  the  mechanisms 
for  switching  between  contexts. 

• the syntactic procedures (methods) expressed in terms of the
commands, as well as in terms of the conceptual operations of
the user. A syntactic procedure shows how to perform a task
defined at the task level.

5. Definition of the interaction style, choice of the lexical details (e.g., world
metaphor vs. conversational metaphor). This is the interaction level of
CLG. It includes:

•  the  interaction  elements  and  the  primitive  actions  performed  by 
the  user  and  by  the  system  (keystroke  and  mouse  selection  are 
examples  of  user  actions;  prompts  and  responses  are  system 
primitive  actions). 

•  the  order  in  which  the  interaction  elements  must  be  specified  by 
the  user  or  produced  by  the  system. 

•  the  interaction  procedures  (methods)  expressed  in  terms  of  the 
primitive  actions  and  in  terms  of  the  conceptual  operations  of  the 
user.  An  interaction  procedure  shows  how  to  perform  a  task 
defined at the task level.

[Diagram legend: entities specific to the level; relation of correspondence; used to express]

Figure 2.9: The CLG layers for the design of user interfaces.


As  the  description  of  CLG  shows,  a  particular  system  is  fully  described  at  various  levels 
of  abstraction.  Each  level  manipulates  its  own  entities  and  operators,  but  these 
elements  are  combined  to  fully  describe  the  system.  Each  level  can  be  viewed  as  a 
refinement  of  the  previous  one  (i.e.,  higher  in  the  hierarchy)  and  each  level  is 
independent  of  the  following  one  (i.e.,  lower  in  the  hierarchy).  By  following  this 
hierarchical method, CLG yields a top-down approach to the design of a user interface.

Although CLG can be usefully exploited as a method, the basic difficulty for the
designer is defining and structuring the user's tasks. If the description of the task
domain does not match the mental representation and the cognitive processes of the
user, the system will probably be hard to use and hard to learn. Unfortunately, an
appropriate organization of the user's tasks requires extensive knowledge of
cognitive psychology, knowledge that most computer scientists have not
mastered.

The subsections that follow provide the designer with practical guidelines that may be
useful for defining the syntactic and interaction levels of CLG.

2.6.2.  Guidelines 

The guidelines presented in this subsection form a very small fraction of the hundreds of
rules currently available in the literature. For a more complete enumeration, refer to
[Scapin 87, Shneiderman 87]. The guidelines that follow are a selection of general
human factors principles that computer scientists may apply easily. They are organized
as a set of seven guidelines: consistency, conciseness, cognitive load reduction,
user-driven interaction, flexibility, dialogue structuring, and error prediction.

2.6.2.1.  Guideline  1:  Consistency 

Consistency implies the absence of exceptions. Exceptions increase learning time and
the likelihood of error. System consistency is a concern at all of the stages that D.
Norman identified for modeling human-computer interaction. This subsection is limited
to the stage of action specification and to the execution stage. Rules for the perception
and the evaluation stages derive directly from those considered here.

•  Consistency  and  the  Action  Specification  Stage 

If  a  goal  is  similar  in  different  environments,  then  the  sequence  of 
actions  to  accomplish  the  goal  should  be  the  same. 

For  example,  a  user  needs  to  "duplicate  an  object  and  print  the  copy  of 
the  object".  The  object  may  be  a  document  or  an  electronic  mail 
message.  In  both  environments,  the  mail  system  and  the  document 
preparation  system,  the  sequence  of  actions  should  be  the  same. 

•  Consistency  and  the  Execution  Stage 

The  execution  stage  includes  syntactic,  lexical  and  pragmatic  issues. 

• With regard to syntax, the designer should determine the order of
command arguments. Experiments indicate that when
commands share arguments, these arguments should appear in
the same order in every command.

• Note that the order does not always match the sequencing of
natural languages and that there is a choice between postfixed
notation and prefixed notation. It seems that for graphical
environments, a postfixed notation is more appropriate whereas
the prefixed notation is adequate for text-based interaction.

• With regard to lexical issues, naming should be consistent. If
some function appears in different contexts, it should be
designated with the same name.

A counterexample of this rule is the function "terminate" in the
Unix world: to terminate a message in the mail system, the user
must enter a line containing a single character; to terminate the
mail system, the user must type "q" (or "x" depending on how the
user wants to reenter the mail system); typing "logout" terminates
a Unix session.

• With regard to pragmatic issues, consistency recommends that the
spatial layout of output information be preserved.

This principle of locality helps the user anticipate gestures on
system outputs. In particular, menu items should always appear
in the same order. The order must primarily depend on a logical
sequencing defined by the task; if the task does not show any
logical order, the frequency criterion should be applied; if the
frequency criterion is not applicable, alphabetical order
should be used. Similarly, locality rules have been defined for
forms: at the top of the form, the user should find the fields that
must be filled, whereas optional items can be gathered at the
bottom. Note that this guideline is consistent with Fitts’s Law: it
minimizes hand movements.
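
As an illustration, the menu ordering rule stated above (task order first, then frequency, then alphabetical order) can be sketched in Python. This is only a sketch under stated assumptions: the `MenuItem` type and its `task_rank` and `frequency` fields are hypothetical names introduced here, and a criterion is assumed to apply only when every item of the menu carries the corresponding information.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MenuItem:
    label: str
    task_rank: Optional[int] = None   # position in the task's logical sequence, if any
    frequency: Optional[int] = None   # observed usage count, if known


def order_menu(items: List[MenuItem]) -> List[MenuItem]:
    """Order menu items with the first criterion that applies to all of them."""
    if items and all(item.task_rank is not None for item in items):
        return sorted(items, key=lambda item: item.task_rank)   # logical task order
    if items and all(item.frequency is not None for item in items):
        return sorted(items, key=lambda item: -item.frequency)  # most frequent first
    return sorted(items, key=lambda item: item.label.lower())   # alphabetical fallback
```

Because the result depends only on the items themselves, the same menu is displayed in the same order every time, which is what the principle of locality asks for.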

2.6.2.2.  Guideline  2:  Conciseness 

Conciseness is the harmonious combination of brief and powerful expressions. In
computer-human interaction, conciseness is achieved in the form of abbreviations,
macrocommands, cut and paste facilities, undo and redo features, and default values.

This section also illustrates that applying these guidelines requires
judgement by the designer. For example, conciseness is desirable for the experienced
user but not for the novice user. It is important to identify the end users of a particular
interface and tailor the interface to their characteristics.

•  Conciseness  and  Abbreviations 

Abbreviations are useful shortcuts for experienced users. Shortcuts are
mandatory. For example, menus are adequate as a technique for
minimizing memory load, but they are clumsy when considering the
action specification stage (a Keystroke-Level Model can be used to
support this assertion). However, in order to be understandable,
abbreviations should be derivable from precise rules.

Common  rules  for  deriving  abbreviations  include: 

1.  Special  character  (e.g.,  escape  or  control)  followed  by  a  letter 
(usually  the  initial  of  the  command  name).  EMACS  is  a  good 
example of the application of this rule.

2. Vowel deletion. For example, the command delete would be
abbreviated as "dlt".

3. Maximum truncation, which consists of suppressing characters
from command names as long as there is no ambiguity. For
example, given the set of command names "compile, copy,
delete," the rule respectively derives "com, cop, del".

4. Two-character truncation. This rule applied to the set "compile,
copy, delete," would derive "cm, cp, dl".
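
The derivation rules 2 to 4 can be sketched in Python as follows. The function names are hypothetical, and two points are assumptions rather than statements from the text: maximum truncation is given a minimum length of three characters, chosen so that the sketch agrees with the three-letter abbreviations quoted in the list, and two-character truncation is read as keeping the first two characters that remain after vowel deletion.

```python
from typing import Dict, List

VOWELS = set("aeiou")


def vowel_deletion(name: str) -> str:
    """Rule 2: keep the first letter, then drop every vowel."""
    return name[:1] + "".join(c for c in name[1:] if c not in VOWELS)


def maximum_truncation(names: List[str], min_len: int = 3) -> Dict[str, str]:
    """Rule 3: shortest prefix of at least min_len characters shared with no other name."""
    result = {}
    for name in names:
        others = [other for other in names if other != name]
        abbreviation = name  # fall back to the full name if no prefix is unambiguous
        for length in range(min(min_len, len(name)), len(name) + 1):
            prefix = name[:length]
            if not any(other.startswith(prefix) for other in others):
                abbreviation = prefix
                break
        result[name] = abbreviation
    return result


def two_char_truncation(name: str) -> str:
    """Rule 4, read here as: the first two characters left after vowel deletion."""
    return vowel_deletion(name)[:2]
```

Whatever rule is chosen, the point of the guideline is that the user can recompute an abbreviation from the command name instead of memorizing it.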

Figure  2.10  illustrates  the  results  of  a  study  that  compares  user  performance  according 
to  the  abbreviation  rule  [John87].  The  response  time  is  the  mean  time  the  user  needs  to 
enter  an  abbreviated  command. 


[Bar chart: response time (msec), with the value 2519 shown, for four abbreviation rules: 2-character truncation, maximum truncation, special character, vowel deletion]

Figure 2.10: Comparative user performance according to the abbreviation rule.


•  Conciseness  and  Macrocommands 

A macrocommand is to interaction languages what a procedure is to
programming languages. It is an abstraction mechanism and an
extension technique. As an abstraction mechanism, it matches the human
cognitive learning processes that encapsulate related pieces of
knowledge into a "bigger" chunk. As an extension technique, it allows
for combining generality and particularity.

Considering human-computer interaction, particularity denotes
user-specific needs. Norman's theory of action identifies the semantic
distance between the formation of the intention and the elaboration of a
plan of commands. One way to shorten the gulf is to provide the user
with a high-level language that directly expresses the most frequent
problem-solving plans. The drawback of a highly tailored language is
the difficulty of expressing unusual tasks.

The conflict between particularity and generality has been solved in Unix
and  Lisp-based  systems  by  providing  the  user  with  a  fairly  low-level, 
general  purpose  language  to  build  new  commands.  These  commands 
may  encapsulate  frequently  encountered  actions  into  a  single 
parametrizable  chunk.  Unfortunately,  the  user  interface  for  defining  such 
macros  forms  a  highly  disappointing  cognitive  barrier  to  the  newcomer 
or  to  the  unmotivated  user. 

• Conciseness and Cut and Paste Facilities

"Cut and Paste" is the electronic version of manual patchwork. As with
manual patchwork, it offers a way to reuse information. For example, it
avoids the need to retype information, or it allows the user to enter
information already provided by the system. Cut and paste is also a
means for overcoming lack of integration between tools. For example,
the user can develop a text with a special purpose text editor, then draw
a picture with a sophisticated interactive editor, and eventually paste the
picture into the text document. In integrated environments, there would
be no need for the user to explicitly use different tools. In any case, cut
and paste operations must appear consistent to the user.

An example of inconsistency is a round-trip transfer of information
between MacDraw and MacPaint. MacDraw manipulates graphical
objects such as circles and polygons, whereas MacPaint handles pixels
only. Suppose a user performs the following actions: draw a circle C
with MacDraw, cut C from MacDraw, paste C into MacPaint, cut C from
MacPaint and finally paste C back into MacDraw. As far as the naive
user is concerned, C looks like a circle in the MacDraw document, but is
not editable anymore as a circle. The cut and paste operations have lost
"semantic" information about the transferred data.

Consistency in the behavior of "cut and pasted" information relies on the
existence of a universal format, as well as on a general type translator.
A  universal  format  defines  a  common  data  representation,  i.e.,  a 
common  formalism,  for  all  of  the  applications,  say,  of  a  workstation.  A 
type  translator  performs  the  required  transformations  between  the  data 
representations  specific  to  an  application  and  the  universal  format.  To 
our  knowledge,  "type  recasting"  is  a  research  topic  that  has  not  been 
investigated  in  its  full  generality. 

•  Conciseness  and  Undo  and  Redo  Features 

Undo has two advantages: it allows the user to easily correct a mistake
and it spares the user the plan of actions that would otherwise be needed to
reverse an unwanted effect. A redo feature avoids the repetition of a sequence of
actions. Both undo and redo support conciseness.
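
One common way to provide undo and redo is to keep two stacks, one of executed actions and one of undone actions. The sketch below assumes that each action is supplied together with its own inverse; the `History` class and its method names are hypothetical, and redo is interpreted here as re-applying the most recently undone action.

```python
from typing import Callable, List, Tuple

Action = Tuple[Callable[[], object], Callable[[], object]]   # (do, undo)


class History:
    def __init__(self) -> None:
        self._done: List[Action] = []
        self._undone: List[Action] = []

    def execute(self, do: Callable[[], object], undo: Callable[[], object]) -> None:
        do()
        self._done.append((do, undo))
        self._undone.clear()          # a new action invalidates what could be redone

    def undo(self) -> bool:
        if not self._done:
            return False              # nothing to undo
        do, undo = self._done.pop()
        undo()
        self._undone.append((do, undo))
        return True

    def redo(self) -> bool:
        if not self._undone:
            return False              # nothing to redo
        do, undo = self._undone.pop()
        do()
        self._done.append((do, undo))
        return True
```

With this structure a single user request replaces the plan of actions that would otherwise be needed to reverse or repeat a command.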

• Conciseness and Default Values

Default values are another means to reuse information. There are two
kinds of default values: static and dynamic. Static values do not evolve
with the session. They are generally hard-wired in the system, or are
acquired at initialization time from a profile file. On the other hand, dynamic
default values evolve during the session. They are computed by the
system from previous user inputs. Figure 2.11 gives an example of the
default value proposed by a system for the file name of a document
being saved in the course of an editing session.

Figure 2.11: An example of a dynamic default value: the default file name for saving
a document is the name of the document being edited. In order to attract attention, the
name is highlighted in reverse video.
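
The difference between static and dynamic default values can be sketched as follows. The `Session` class, the `save_name` key, and the value `untitled.txt` are hypothetical and serve only to illustrate the save-file example; a real system would read its profile from a file rather than from a dictionary.

```python
from typing import Dict, Optional

STATIC_DEFAULTS: Dict[str, str] = {"save_name": "untitled.txt"}   # hypothetical built-in value


class Session:
    def __init__(self, profile: Optional[Dict[str, str]] = None) -> None:
        # Static defaults: fixed at start-up from built-in values and an optional profile.
        self._static = {**STATIC_DEFAULTS, **(profile or {})}
        # Dynamic defaults: recomputed from user inputs while the session runs.
        self._dynamic: Dict[str, str] = {}

    def open_document(self, name: str) -> None:
        # The document being edited becomes the proposed name for the next save.
        self._dynamic["save_name"] = name

    def default(self, key: str) -> str:
        # A dynamic value, when one exists, takes precedence over the static one.
        return self._dynamic.get(key, self._static[key])
```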


2.6.2.3.  Guideline  3:  Cognitive  Load  Reduction 

The literature describes many ways of reducing the cognitive load. Among them, we
select the use of menus and forms, and informative and immediate feedback.

• Cognitive Load Reduction and Menus/Forms

Experiments show that human beings are better at recognizing than at recalling.
Menus and forms, which present alternatives, are good short-term
memory extensions.

• Cognitive Load Reduction and Immediate and Informative Feedback

Generally speaking, feedback is a reaction to some cause. In the context of
human-computer interaction, feedback is an output expression produced by
the system that has processed some user input. The interpretation of the feedback
by the user leads to the evaluation of the situation before carrying on with the plan of
actions. Thus, the feedback has the responsibility of expressing the state of the
system.


[Screen capture: three scroll bars over a text editing window, labeled "Non Informative" (top left), "Medium Informative" (top right), and "Informative Scroll Bar" (bottom, showing "Page 2")]

Figure 2.12: How informative is informative feedback?


The  system  state  is  described  by  a  wide  variety  of  data  structures.  As  far 
as  human-computer  interaction  is  concerned,  the  system  state  is 
comprised  of  the  data  structures  that  are  of  interest  to  the  user.  These 
data  structures  are  those  that  match  the  psychological  variables  involved 
in  accomplishing  the  task.  The  system  feedback  has  the  responsibility  of 
presenting  these  data  structures  in  a  form  that  helps  the  evaluation.  It  is 
also  in  charge  of  immediately  informing  the  user  of  the  changes 
happening to such structures.

The changing shape of the cursor is an example of immediate feedback.
A cursor shape can be used to remind the user that a particular mode is
active (e.g., drawing or erasing); cursor shapes, such as the hour glass,
the wrist watch (or the cup of tea) are useful to inform the user of a long
operation. Dynamic techniques such as progress bars convey more
information about the evolution of a time-consuming operation.

As an illustration of how informative feedback can be, we
consider the presentation of the psychological variable "page number,"
which is of interest in document editing tasks. Figure 2.12 presents three
possible forms of feedback for this variable. In the top left corner of the figure,
the position of the elevator in the scroll bar indicates that the current
view is about half-way through the document. In the top right corner, the
elevator includes extra information: an integer. After some practice, the
user infers that it refers to a page number. On the bottom screen, the
user is fully informed of the current position of the window in the
document.

In a nutshell, informative feedback should answer the following user questions
[Nievergelt 80]: "Where am I? What can I do? What have I done?"
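
The three degrees of feedback discussed for the "page number" variable can be mimicked with a small function. This is a rough sketch: the function name, the level names, and the idea of reporting the elevator position as a percentage are assumptions made for illustration, since the scroll bars described above are graphical.

```python
def scroll_feedback(first_line: int, total_lines: int, lines_per_page: int, level: str) -> str:
    """Describe the window position at one of three levels of informativeness."""
    if level == "position":
        # Elevator position only: how far the view is into the document.
        fraction = first_line / max(total_lines - 1, 1)
        return f"{fraction:.0%}"
    page = first_line // lines_per_page + 1   # first_line is counted from 0
    if level == "number":
        return str(page)                      # bare integer; its meaning must be inferred
    if level == "labeled":
        return f"Page {page}"                 # states explicitly what the number means
    raise ValueError(f"unknown feedback level: {level}")
```

Only the last form answers the question "Where am I?" without requiring the user to learn a convention first.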

2.6.2.4.  Guideline  4:  User-Driven  Interaction 

Users should have the initiative in a dialogue with a computer. This recommendation
stems  from  the  view  of  the  computer  as  a  tool:  the  computer  is  a  submissive  server, 
whereas  the  user  is  the  principal  actor.  Actually,  there  is  a  more  generous  view  of  the 
computer:  that  of  a  collaborator. 

In  a  collaboration,  each  partner  acts  according  to  each  one’s  competence.  In  the 
particular  case  of  human-computer  interaction,  the  computer  should  behave  as  the 
extension  of  the  user’s  skills.  It  should  let  the  user  act  freely  and  take  control 
arbitrarily.  The  difficulty  for  the  user  interface  designer  lies  in  identifying  the  transition 
points  where  control  shifts  from  the  user  to  the  computer  and  back. 

In  both  cases,  whether  the  computer  is  a  tool  or  a  collaborator,  users  should  not  be 
modeled  as  finite  state  machines.  Automata  offer  a  convenient  way  for  modeling 
relations between predictable and well-defined states. States involved in human
problem solving are largely unknown and their relations are mostly unpredictable.
Human problem solving is basically opportunistic, mixing the top-down approach with
the bottom-up approach [Hayes-Roth 79]. As a result, it must not be constrained by an
inflexible model of interactions.

To  summarize,  give  the  user  the  illusion  of  driving  the  system. 

2.6.2.5.  Guideline  5:  Flexibility 

Flexibility is mainly concerned with the notions of customization and multiple rendition
of  a  concept. 

•  Flexibility  and  Customization 

Customization is the adaptation of the user interface to the user. A user
interface can be adaptative or adaptable. An adaptative user interface
automatically evolves depending on the user’s mental state. An
adaptable user interface is manually modified to fit the user's
requirements. In both cases, whether the user interface is adaptative or
adaptable, the current facilities for customization are rather limited.

An adaptative user interface relies on the existence of an intelligent
observer that tracks the actions of the user, infers the user's mental state
and modifies its behavior accordingly. The notion of intelligent observer
supports the view of the computer as a collaborator. Unfortunately, the
realization of an effective observer relies on a thorough understanding of
human cognitive behavior. Given our limited knowledge in this domain,
a lot of research needs to be pursued in the area of adaptative user
interfaces. Currently, a more practical approach is the manual adaptation
of user interfaces.

An adaptable user interface relies on the existence of a software
architecture that separates functional mechanisms
from presentation policies. Functional mechanisms implement the
high-level semantics of the interaction, whereas presentation policies deal
with the syntactic and lexical issues. A software architecture that
satisfies  this  requirement  makes  possible  the  modification  of  the 
syntactic  and  lexical  aspects  of  the  system  without  side  effects  on  the 
internal  functioning.  For  example,  it  is  easy  to  repair  the  "surface"  of  the 
interaction,  such  as  changing  a  command  or  a  parameter  name,  without 
any  code  recompilation.  Although  it  is  possible  to  modify  the  lexical  and 
syntactic  aspects  of  the  presentation,  it  is  not  possible  to  change  the 
structuring  of  the  interaction.  This  issue  is  the  topic  of  Guideline  6. 

Other complementary approaches to customization include facilities for
building  new  commands  (macrocommands)  and  defining  abbreviations. 
These  two  aspects  have  already  been  discussed  in  2.6.2.2. 

A priori, customization seems to conflict with consistency. By analogy with
architectural design, a framework is provided that can be moderately
reorganized and decorated as desired: it is possible to change the
location of a secondary wall but certainly not the location of a wall that
supports the building. It is also possible to choose wallpaper and
carpeting, because they are independent of the framework. Similarly, with an
appropriate software architecture, it is possible to change the lexical and
syntactic aspects of the interactive system without damaging the overall
organization that is the referential framework for consistency.
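
The separation between functional mechanisms and presentation policies described for adaptable user interfaces can be sketched as a table of command names kept apart from the functions they designate. Everything named below (`FUNCTIONS`, `Presentation`, the commands "add" and "delete") is hypothetical; the sketch only shows that a command can be renamed as data, without touching the code that implements it.

```python
from typing import Callable, Dict, List

# Functional mechanisms: what the commands do, independent of how they are named.
FUNCTIONS: Dict[str, Callable[[List[str], str], None]] = {
    "add": lambda store, item: store.append(item),
    "delete": lambda store, item: store.remove(item),
}


class Presentation:
    """Presentation policy: maps user-visible command names to function identifiers."""

    def __init__(self, names: Dict[str, str]) -> None:
        self.names = dict(names)

    def rename(self, old: str, new: str) -> None:
        # Only the lexical surface changes; FUNCTIONS is untouched.
        self.names[new] = self.names.pop(old)

    def dispatch(self, command: str, store: List[str], argument: str) -> None:
        FUNCTIONS[self.names[command]](store, argument)
```

In the terms of the architectural analogy, `FUNCTIONS` plays the role of the supporting walls and the name table that of the wallpaper.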

•  Flexibility  and  Multiple  Rendition 

Multiple  rendition  is  a  facility  for  multiple,  possibly  simultaneous  views  of 
a  given  concept.  Each  view  matches  a  particular  need  at  some  stage  of 
a given task. For example, in text editing, it could be possible to view the
document  as  a  table  of  contents  and  simultaneously  read  a  particular 
chapter  or  subsection.  The  table  of  contents  and  the  subsection  are  two 
views  of  the  same  data  structure  that  represents  the  document. 

Figure  2.13  gives  an  example  of  a  multiple  representation  of  the  same  concept. 

Chapter 4 describes some software techniques that support multiple rendition.

Figure 2.13: Multiple representation of the same concept. On the left, data are
presented as bar charts; on the right, the same data are collected in a table.
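
Multiple rendition amounts to deriving several views from one shared data structure. The following sketch renders the same data as a textual bar chart and as a table; the names `SharedModel`, `as_bar_chart`, and `as_table` are hypothetical, and this is not one of the techniques of Chapter 4, only a minimal illustration of the idea.

```python
from typing import Callable, Dict, List

Data = Dict[str, int]


def as_bar_chart(data: Data) -> str:
    """One line per entry, the value drawn as a row of '#' characters."""
    return "\n".join(f"{name:<6}{'#' * value}" for name, value in data.items())


def as_table(data: Data) -> str:
    """One line per entry, the value printed as a right-aligned number."""
    return "\n".join(f"{name:<6}{value:>4}" for name, value in data.items())


class SharedModel:
    """A single data structure from which every view is rendered."""

    def __init__(self, data: Data) -> None:
        self.data = dict(data)
        self.views: List[Callable[[Data], str]] = []

    def render_all(self) -> List[str]:
        # Every view reads the same data, so all renditions stay in agreement.
        return [view(self.data) for view in self.views]
```

Because no view keeps a private copy of the data, a change to the model shows up in every rendition the next time it is rendered.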


2.6.2.6. Guideline 6: Structured Dialogue

Structuring is a general technique for mastering complexity. Dialogue structuring
consists of organizing the command space into layers of increasing complexity. By
doing so, the novice user is able to successfully accomplish simple tasks that are
presented right away in the system image. As the system becomes more familiar, the
user will gradually discover new functions that are more complex to handle but not
necessary to get usual tasks done. Dialogue structuring into levels of increasing
complexity is known as the "training wheels" technique [Carroll 84].

This principle of dialogue structuring, which has the nice effect of leading to successful
task closure (feeling of relief, satisfaction of work done), is certainly not easy to put into
practice. It requires a thorough task and user analysis, which is not often performed by
computer scientists.
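
A layered command space can be sketched by attaching a layer number to each command and presenting only the commands of the layers the user has reached. The command names and their assignment to layers below are invented for illustration; in practice they would come out of the task and user analysis mentioned above.

```python
from typing import Dict, List

# Hypothetical command space: each command is assigned to a layer of complexity.
COMMAND_LAYERS: Dict[str, int] = {
    "open": 1, "save": 1, "print": 1,
    "search": 2, "replace": 2,
    "define_macro": 3,
}


def visible_commands(user_layer: int) -> List[str]:
    """Commands presented to a user who has reached the given layer."""
    return sorted(name for name, layer in COMMAND_LAYERS.items() if layer <= user_layer)
```

A novice at layer 1 sees only the commands needed for simple tasks; the more complex commands appear as the user moves to higher layers.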

2.6.2.7.  Guideline  7:  Error  Prediction 

Errorless  interaction  is  illusory,  but  the  computer  system  can  provide  support  for  error 
detection and error recovery. D. Norman [Norman 86] identifies two classes of errors:
mistakes and slips. A mistake results from the formulation of an inappropriate
intention.  A  slip  is  an  unintended  action.  Both  of  them,  mistakes  and  slips,  generally 
come  from  the  inadequacy  of  the  system  image.  The  system  image  should  minimize 
error  occurrences,  and  facilitate  error  detection  and  error  repair. 

•  Support  for  minimizing  errors  and  for  improving  detection 

Occurrences of errors can be minimized and error detection can be
improved in several ways: an appropriate metaphor of interaction, an
adequate terminology, and immediate and informative feedback.

When considering slips only, techniques dealing with conciseness avoid
slips by allowing the user to reuse information without any risk of entering
incorrect data.

A metaphor of interaction defines a model to which a novice user
can refer by analogy to interact with the system. There are
currently two major metaphors for interaction: the world
metaphor and the conversation metaphor [Hutchins 86]. The
world metaphor electronically mimics objects of the real world. A
popular example of the world metaphor is the desktop metaphor,
where icons represent actual folders and documents, and where
the mouse is the electronic extension of the hand. The
conversation metaphor is based on a linguistic description of the
actions to be performed on system objects. Examples of the
conversation metaphor include textual command languages
such as the Unix Shell. In the conversation metaphor, the user
talks about an implicit world (the user describes what is to be
done), whereas, in the world metaphor, the user directly
manipulates objects (the user does not tell how to do it, but does
it instead). Thus, "direct engagement" of the user shortens the
gulf between mental and computerized representations. It
should minimize error occurrences. However, in cases where
there is a mismatch between the metaphor and its electronic
implementation, errors might be created rather than reduced.
Consequently, care should be taken to make clear the limits of
the metaphor used.

Adequate terminology has to do with the choice of names.
Consistency is an important feature in naming, but the terms
should also be understandable to the user. A well designed software
architecture combined with tools for lexical and syntactic
customization can overcome inappropriate wording.

Immediate and informative feedback has been discussed in
Section 2.6.2.3 in relation to reducing cognitive load. With regard to
errors, feedback may prevent slips such as forgetting the current
mode of interaction. It may also protect the user from making wrong
decisions or wrong inferences.

•  Support  for  Easy  Repair 

Error repair is a problem solving activity. The support for such activity
comes in several forms. It includes undo/redo facilities and informative
error messages.

Undo and redo facilities not only avoid the slips that may occur
when a command has to be respecified, but also
encourage investigation. As such, they provide the user with
effective support for problem solving.
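
The undo/redo support described above can be sketched as a pair of stacks. This is
only an illustration under stated assumptions: the class name History and its
methods are hypothetical, and each command is assumed to come with an explicit
inverse operation.

```python
class History:
    """Undo/redo history for reversible commands (illustrative names)."""

    def __init__(self):
        self._undo = []  # executed commands, most recent last
        self._redo = []  # undone commands that can still be redone

    def execute(self, do, undo):
        """Run `do` now and remember `undo` as its inverse."""
        do()
        self._undo.append((do, undo))
        self._redo.clear()  # a new command invalidates the redo branch

    def undo(self):
        """Revert the most recent command; return False if there is none."""
        if not self._undo:
            return False
        do, undo = self._undo.pop()
        undo()
        self._redo.append((do, undo))
        return True

    def redo(self):
        """Reapply the most recently undone command; return False if none."""
        if not self._redo:
            return False
        do, undo = self._redo.pop()
        do()
        self._undo.append((do, undo))
        return True


text = []
history = History()
history.execute(lambda: text.append("hello"), lambda: text.pop())
history.execute(lambda: text.append("world"), lambda: text.pop())
history.undo()  # removes "world"
history.redo()  # restores "world" without retyping the command
```

Because redo replays the stored command, the user never has to respecify it, which
is where the slips mentioned above would otherwise creep in.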

Error messages such as "SYNTAX ERROR!" are useful for error
detection but are far from helpful for error repair. They
require a rather tedious and sometimes frustrating evaluation
phase. Error messages should clearly express the exact cause
of the error and provide the user with additional information
about state variables relevant to the current problem. Figure
2.14 illustrates the case of an error message helpful for error
repair.

There isn't enough room on the disk to
duplicate or copy the selected items
(additional 145,408 bytes needed).

Figure 2.14: An example of an informative error message for easy repair. The
system makes explicit the cause of error and provides the user with additional
information useful in the repair problem solving task.

3.  The  Levels  of  Abstractions  in  Interactive  Software


This chapter:

• Identifies the abstractions involved in constructing interactive software.

• Introduces the terminology that will be used in the remainder of this
document.


3.1.  Introduction 

An interactive system calls for various levels of services, ranging from low-level
physical I/O handling to the high-level management of the interaction. As shown in
Figure 3.1, these services may be viewed as a hierarchy of abstract machines:

• At the bottom of the hierarchy, device drivers directly control the physical
devices. A device driver is a program tailored to the physical functioning
of a particular class of devices. Interactive software includes a driver for
each class of devices it supports. Generally, these drivers are part of the
underlying operating system. They define the device dependent layer.

• The next layer hides the diversity and the functioning of the physical
devices by defining a virtual terminal. A virtual terminal provides client
programs with device independence but is not able to support device
sharing.

•  Device  sharing  between  multiple  software  activities  is  implemented  by 
window  systems.  Window  systems  give  client  programs  the  illusion  of 
being  the  unique  owners  of  one  (or  several)  virtual  terminal(s).  Virtual 
terminals  are  programmable  at  a  fairly  low  level  of  abstraction.  This 
level  may  not  be  convenient  for  client  programs  which  deal  with  highly 
structured  data. 

• Abstract image machines shorten the gap between the internal
representations used by client programs and the external
representations required by the graphics package provided by (or sitting
on top of) the window system. An abstract image is an intermediate data
structure which expresses output rendition at a high level of abstraction
and which supports high level input facilities. Inputs and outputs,
whether they are expressed at a high level of abstraction or not, require
some kind of control that organizes their occurrence.

• Dialogue control shapes the interaction between the application and the
user all the way down through the underlying abstract machines. The
dialogue machine can be seen as a mediator between the application
and the user. It bridges the gap between the abstract, media
independent world of the application and the universe that makes up the
user interface.

• At the very top of the hierarchy, the application implements the functional
core of the interactive system. This core is media independent, that is, it
has no knowledge of the way its data structures and functions are
exposed to the user. Its purpose is to implement an expertise in a
specific domain that will allow the user to perform specific tasks in that
domain. It is not concerned with how this expertise is made accessible to
the user.


The following sections detail the nature of the abstractions that respectively allow for
device independence, device sharing, abstract imaging and dialogue management.

3.2.  Device  Independence

Physical independence has multiple facets. It primarily comes in the form of a virtual
terminal that hides the actual functioning of quite different I/O devices, so that devices
can be substituted without modifying client programs. It may also allow for the addition
or removal of devices without recompiling or even relinking the existing code. In this
section, the focus is on code reusability. We first identify the problems due to the
diversity of physical devices. We then sketch the principles of how physical
independence is achieved.


3.2.1.  The  Problem 

Reading or writing data with direct control of the physical terminal presents two
difficulties: first, it imposes a precise knowledge of the functioning of the physical
devices on the programmer; second, and more important, it compromises software
portability. For example, to move the cursor to "line 1, column 2" of the physical screen, a
programmer would provide a VT100 driver with the sequence "ESC[1;2f". Clearly, this
sequence becomes obsolete when the VT100 is replaced by a bitmap display.

The software solution to this diversity and complexity is abstraction.
In the case of interest, the abstraction is a virtual terminal which provides
client programs with a unified and simplified view of actual terminals.


3.2.2.  The  Notion  of  Virtual  Terminal 

A  virtual  terminal  is  an  abstract  terminal.  As  such,  it  provides  client  programs  with  an 
instruction  set  for  expressing  inputs  and  outputs,  and  the  instruction  set  can  be 
mapped  to  a  variety  of  physical  terminals.  Let's  see  the  principles  of  these  I/O 
primitives. 


Figure 3.2 illustrates the principles of output operations of a virtual terminal. In this
example, the primitive SetCursor issued by the client program moves the cursor to
location (2:3) in the virtual coordinate space. The virtual terminal, whose job is the
interpretation of primitives from client programs, translates the virtual location into the
physical coordinate space and calls device dependent primitives. These primitives
correspond to the physical device that is currently linked to the client program.
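
A minimal sketch of this output path follows. It rests on several assumptions:
the class names are hypothetical, the location (2:3) is read as row 2, column 3,
and the 8x16 pixel character cell of the bitmap display is an arbitrary
illustrative value. The VT100 case uses the "ESC [ row ; column f" cursor
addressing sequence.

```python
from abc import ABC, abstractmethod


class Device(ABC):
    """Device dependent layer: one subclass per class of physical device."""

    @abstractmethod
    def move_cursor(self, row, col):
        ...


class VT100(Device):
    def __init__(self):
        self.sent = []  # character sequences that would be sent to the terminal

    def move_cursor(self, row, col):
        # VT100 cursor addressing: ESC [ row ; col f
        self.sent.append(f"\x1b[{row};{col}f")


class BitmapDisplay(Device):
    def __init__(self, cell_width=8, cell_height=16):  # assumed cell size in pixels
        self.cell_width = cell_width
        self.cell_height = cell_height
        self.cursor_px = (0, 0)

    def move_cursor(self, row, col):
        # Map a 1-based character cell to the pixel at its top-left corner.
        self.cursor_px = ((col - 1) * self.cell_width, (row - 1) * self.cell_height)


class VirtualTerminal:
    """Device independent layer seen by client programs."""

    def __init__(self, device):
        self._device = device  # physical device currently linked to the client

    def set_cursor(self, row, col):
        self._device.move_cursor(row, col)


def client(terminal):
    # The client program is identical whatever the physical device is.
    terminal.set_cursor(2, 3)


vt100 = VT100()
bitmap = BitmapDisplay()
client(VirtualTerminal(vt100))
client(VirtualTerminal(bitmap))
```

Replacing the VT100 by the bitmap display changes only the object handed to
VirtualTerminal; the client function is untouched.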

This example mentions the notion of coordinate space. Virtual coordinate spaces may
be integer or fractional systems. The choice between the two is based on
compromises between ease of implementation and effectiveness of physical
independence. As an example of compromise, the virtual terminal defined in X
Windows is based on the hypothesis that physical screens have square pixels. As a
result, the primitive that is supposed to draw a circle produces an ellipse on screens
whose pixels are rectangular (such as the Apple Lisa or TV screens).

For inputs, the chaos due to the diversity of physical devices (keyboards, mice,
electronic gloves, etc.) has been organized in the form of typed classes. The types are
specific to a particular implementation of a virtual terminal. In general, they include:

•  The  class  key,  which  models  physical  keyboards. 

•  The  class  locator  to  denote  the  location  of  a  pixel  in  the  virtual 
coordinate  space. 

• The class choice, which returns an integer value useful to represent
mouse buttons.

•  The  class  valuator  to  model  physical  devices  such  as  potentiometers 
that  generate  real  values. 

•  The  class  modifier,  a  bit  string  whose  value  can  be  interpreted  as  a 
modifier  of  the  semantics  of  the  value  returned  by  other  classes. 

•  The  class  application  to  allow  client  programs  to  synthesize  client- 
dependent  events. 

•  The  class  time-stamp  to  indicate  the  time  when  a  physical  action 
happened. 

Input classes such as locator, choice and valuator were first introduced by GKS [ISO
85] and Core. Today, they are implicitly embedded in the device-independent layer
provided by window systems. Other classes, such as the modifier, application and
time-stamp classes, have been made popular by window systems. The last two deserve
additional comments:

•  The  application  class  allows  client  programs  to  extend  the  basic  set  of 
input  classes.  Application  programs  can  set  up  their  own  protocol  of 
communication  by  defining  special  purpose  events,  and  exploit  the 
communication  mechanism  provided  by  the  window  system.  This 
feature  is  an  interesting  property  of  the  Macintosh  event  manager. 

• The time-stamp class is useful to overcome two types of hardware
limitations. The first limitation is the sequentiality of the interrupt
mechanism: two actions that appear as simultaneous to the user are
reflected to the client program as two separate events. A time-stamp
value may be considered a means to glue the events back into a single
abstract one. The second limitation happens at the physical device
itself. One well-known example is the one-button mouse of the
Macintosh, which can be used as a two-button (or even a three-button)
mouse by double (or triple) clicking the unique physical button. Again,
time-stamps associated with the successive events allow for
synthesizing events.
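
As an illustration of how time-stamps allow events to be synthesized, the sketch
below merges successive presses of a single physical button into single, double or
triple clicks. The names and the 300 ms threshold are assumptions made for the
example, not values taken from any particular window system.

```python
from dataclasses import dataclass


@dataclass
class ButtonPress:
    timestamp_ms: int  # value of the time-stamp class
    x: int             # locator, virtual coordinates
    y: int


@dataclass
class Click:
    count: int         # 1 = single, 2 = double, 3 = triple click
    x: int
    y: int


def synthesize_clicks(presses, max_gap_ms=300):
    """Merge presses separated by at most max_gap_ms into one multi-click."""
    clicks = []
    previous = None
    for press in presses:
        close_in_time = (
            previous is not None
            and press.timestamp_ms - previous.timestamp_ms <= max_gap_ms
        )
        if close_in_time:
            clicks[-1].count += 1  # extend the click being built
        else:
            clicks.append(Click(1, press.x, press.y))
        previous = press
    return clicks
```

A fuller version would probably also compare the locator values, so that two quick
presses far apart on the screen are not merged; that check is omitted here.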

Figure 3.3 completes the picture of the functioning of a virtual terminal begun in
Figure 3.2. The client program acquires an event through the
GetEvent primitive provided by the virtual terminal. This event is a device-independent
description of some action performed by the user on physical input devices. The job of
the client program is to interpret the content of the description. The job of the virtual
terminal is to build the abstract representation of the physical events. In the example of
Figure 3.3, the returned event is a combination of a locator and a choice that represents
the screen location pointed to by the user with a mouse.
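
The input path can be sketched as follows. The Event record groups the typed
classes listed earlier; the driver callbacks (on_mouse_button, on_key) and the
scale factors are hypothetical and stand in for whatever the device dependent
layer really provides.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Event:
    timestamp_ms: int                          # time-stamp class
    locator: Optional[Tuple[int, int]] = None  # position in virtual coordinates
    choice: Optional[int] = None               # e.g. mouse button number
    key: Optional[str] = None                  # key class
    modifier: int = 0                          # bit string, e.g. bit 0 = shift


class VirtualTerminalInput:
    def __init__(self, scale_x=1.0, scale_y=1.0):
        self._queue = deque()
        self._scale = (scale_x, scale_y)  # device to virtual coordinate scaling

    def on_mouse_button(self, timestamp_ms, device_x, device_y, button, modifier=0):
        """Called by the (hypothetical) mouse driver with device coordinates."""
        virtual = (round(device_x * self._scale[0]), round(device_y * self._scale[1]))
        self._queue.append(
            Event(timestamp_ms, locator=virtual, choice=button, modifier=modifier)
        )

    def on_key(self, timestamp_ms, char, modifier=0):
        """Called by the (hypothetical) keyboard driver."""
        self._queue.append(Event(timestamp_ms, key=char, modifier=modifier))

    def get_event(self):
        """Counterpart of GetEvent: oldest pending event, or None if none."""
        return self._queue.popleft() if self._queue else None
```

The client only ever sees Event values: it interprets a locator and a choice
without knowing whether they came from a mouse, a tablet or another device.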


To  summarize,  device  independence  is: 

• Primarily intended to increase software portability by allowing the
substitution of physical devices without damaging existing code.
Although this capability is a desirable feature for programmers, it should
be stressed that, from the point of view of the user, physical devices are
not equivalent. Card, Moran and Newell suggested [Card 83] that the
mouse is adequate for the selection of 2D objects, whereas the joystick
is more appropriate for 3D manipulations.

•  Embedded  in  window  systems. 

• Hard to achieve fully.

3.3.  Device  Sharing


Device sharing is built on device independence. Its purpose is to make available not a
single virtual terminal, as device independence does, but multiple instances of virtual
terminals. Why is this interesting? What are the principles of its realization in windowing
systems, and what is the trend in current windowing systems?

3.3.1.  Justification 

A virtual terminal, as provided by the device independence layer, is a resource that
may be simultaneously accessed by multiple activities. In the late sixties,
multiprocessing was not accessible to the user. Processes were internal creatures that
helped the system do its job. Today, the user can explicitly or implicitly launch multiple
activities, all of which act as producers and consumers of the terminal. In the same way
that system engineers introduced the notion of virtual resources (e.g. virtual memory) to
extend the capabilities of the core hardware components, interactive software
engineers defined windowing systems to extend the capability of terminals.
Windowing systems behave like virtual resource generators by providing client
programs with any number of virtual terminals.

Device sharing is necessary for multiprocessing environments. Whether the
environment is multiprocess or not, it is also useful as a technique for organizing
information on the screen: output expressions that are linked by some logical criteria
need to be physically gathered on the screen. Regions that result from such grouping
compete for rendition. This competition also occurs for input events, which are to be
dispatched to the appropriate recipient (process, region, etc.). This is the familiar
multiplexing/demultiplexing problem that is commonly encountered in operating
systems. In the case of interest, the solution to the problem is based on the notion of
window.


3.3.2.  The  Window  at  the  Center  of  the  Multiplexing/Demultiplexing
Mechanism

The notion of window and the terminology vary widely from one window system to
another. It is necessary to distinguish between the window as the elementary drawing
surface that is mapped onto the screen, the notion of drawing surface that need not be
mapped onto the screen, and the window as the object that the user manipulates. The
way these notions are implemented and organized together depends on the window
system. The primary interest in this section is the window as the elementary drawing
surface mapped onto the screen. More details are provided in Chapter 4.

A window as an elementary drawing surface is a drawing context. This context
includes a set of pixels and a system coordinate space. The set of pixels is used for
rendering output expressions and for returning pixel locations expressed in the window
coordinate space. To take advantage of possible hardware support for raster
operations, the set of pixels usually forms a rectangular area. This area is conceptually
visible on the screen and defines the sharing unit.

Sharing uses the notion of owning: a window has an owner (e.g. a particular process).
The way the owner is identified is outside the scope of this subsection. Output requests
issued by client processes are demultiplexed by the window system; input requests are
multiplexed. For output, the window system clips any information that lies outside the
drawing area of a window. Input events are dispatched according to one of two
techniques: either the window system broadcasts the event to all of the windows that
have expressed their interest in this type of event, as in NeWS [SUN 87], or it sends
the event to the "current focus window," as in X Windows [Scheifler 86]. Which
window is the current focus for keyboard events, or for any combination of
typed events, is decided by some client process issuing the appropriate request.
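
The two dispatching techniques and the clipping of output can be sketched as
below. This is a toy model under stated assumptions: it does not reproduce the
NeWS or X Windows interfaces, and all class, method and field names are
hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Window:
    name: str
    width: int
    height: int
    interests: set = field(default_factory=set)   # event types this window asked for
    received: list = field(default_factory=list)  # events delivered to the owner
    pixels: set = field(default_factory=set)      # pixels actually drawn

    def draw_point(self, px, py):
        """Draw in window coordinates; anything outside the area is clipped."""
        if 0 <= px < self.width and 0 <= py < self.height:
            self.pixels.add((px, py))
            return True
        return False


class WindowSystem:
    def __init__(self):
        self.windows = []
        self.focus = None  # set on request of some client process

    def dispatch_broadcast(self, event_type, payload):
        """First technique: deliver to every window interested in this type."""
        for window in self.windows:
            if event_type in window.interests:
                window.received.append((event_type, payload))

    def dispatch_to_focus(self, event_type, payload):
        """Second technique: deliver only to the current focus window."""
        if self.focus is not None:
            self.focus.received.append((event_type, payload))
```

With broadcasting, the set of recipients follows from the interests each window
declared; with a focus window, a single recipient is chosen explicitly in advance.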

3.3.3.  Trends  in  Windowing  Systems 

At first sight, window managers look very similar: client programs can create windows,
move windows, resize windows, etc. Nevertheless, there is no common terminology;
the basic functional concepts, such as the way events are dispatched, differ profoundly
from one windowing system to another; as shown in Chapter 4, windowing systems
also differ in their architecture. However, the new generation of window systems
illustrated by NeWS and X Windows have several features in common:

•  Presentation  policies  are  distinct  from  functional  services.  By  doing  so, 
the  "look  and  feel"  of  windows  can  be  customized  without  changing  the 
code  that  implements  terminal  resource  sharing; 

•  A  server  is  in  charge  of  the  execution  of  the  functional  services.  By 
doing  so,  client  programs  and  the  windowing  system  need  not  be 
running  on  the  same  physical  machine  and  client  programs  can  create 
remote  windows. 

These  issues  will  be  further  discussed  in  Chapter  4.  To  summarize  the  topic  about 
device  sharing,  we  take  the  point  of  view  of  the  user.  Window  systems  allow  the  user  to: 

• Carry out several activities concurrently.

• Gather information that is semantically connected by some
psychological or task criteria.

•  Ask  for  multiple  views  of  the  same  concepts  in  distinct  regions  of  the 
screen. 

So far, we have identified and described abstractions that make possible the
expression of inputs and outputs in a device independent way and without any risk that
a client process will damage another process's space. We now need to analyze the
information that is carried by these expressions.

3.4.  Abstract  Imaging 


Abstract imaging is a technique for acquiring inputs and for rendering outputs at a level
of abstraction compatible with the requirements of client programs and windowing
systems. The problem posed by input and output operations is identified in the next
paragraph. The principles of one possible solution are then presented.

3.4.1.  The  Problem

Primitives provided by window systems for the expression of I/O are device
independent, but the concepts they manipulate generally lie at a fairly low level of
abstraction: pixels, lines, circles, rectangles, splines are the usual notions. At most,
one finds the encapsulation of graphical requests in very much the same way a
macrocommand denotes a set of commands: PostScript proposes the notion of path
[SUN 87], QuickDraw implements the notions of region and picture [Rose 86], and GKS
the notion of segment [ISO 85]. These encapsulations help structure the output
expressions, but the operators they allow are very limited in scope. In particular, the
entity described in a graphic macro can be rotated, enlarged, or moved as a whole, but its
content cannot be dynamically modified. This restriction is in conflict with the editing
nature of interaction.

In Chapter 5, we will describe tools that are more appropriate for this sort of
requirement. For now, the principles of the approach are presented in the next
subsection.

3.4.2.  The  Principles  of  the  Notion  of  Abstract  Image 

The purpose of an abstract image is to hide the functioning of the virtual terminal. An
abstract image is a data structure that acts as a mediator between some client data
structure to be exposed to the user, and a real image expressed in terms of some
graphics package. The exact nature of abstract images will be made more explicit in
Chapter 5. For now, we limit the description to the principles. Figure 3.4 illustrates
the role of an abstract image.

The client program builds an abstract image that represents a domain-dependent data
structure. The abstract image is automatically processed by an abstract image
machine. This machine generates graphic requests that are interpreted by the
underlying graphics package. The "concrete" or real image can be produced either in
an offscreen bitmap or in a window. If there is no windowing system, then the image
must be generated on the physical screen. The choice between the first two
techniques depends on the facilities provided by the window system.



CMU/SEI-89-TR-4 


[Figure 3.4 not reproduced. Panels: domain dependent data structure; abstract image;
real image produced in a) an offscreen bitmap, b) a window, c) the physical screen.]

Figure 3.4: An abstract image is a mediator between a domain dependent
data structure and a real image.


For some windowing systems such as X Windows-V10, offscreen bitmaps accept a very
limited set of operations. In particular, it is impossible to draw on an offscreen bitmap;
it can only be filled with pixels through raster operations. In such circumstances,
the real image must be produced in a window. The advantage of an offscreen bitmap
over the direct mapping in a window is its use as a "visual cache." An offscreen
bitmap can be larger than a window. As a result, it may contain extra information useful
for repainting the content of an enlarged or scrolled window.

Abstract imaging is not only useful for hiding the functioning of the virtual terminal and
for processing syntactic user tasks such as scrolling and resizing windows, but also as
a convenient technique for multiple rendition of a given concept. The capability for the
user to observe different views of the same concept enhances the flexibility of the
interaction. (Flexibility is one of the ergonomics rules described in Section 2.6.2.) Figure 3.5
shows how multiple views are obtained from the same abstract image. In the example,
the client data structure represents the concept of a house. Its corresponding abstract
image is interpreted in two ways. One possible interpretation provides a picture of the
house as a floor plan with a lot of details (room names and furniture). In the second
interpretation, irrelevant details are suppressed. Both real images may be
simultaneously visible on the screen and both are automatically updated as the
abstract image is modified.


[Figure 3.5 not reproduced. Panels: abstract image; real images on offscreen bitmaps
(floor plan with room labels such as Bedroom, Kitchen, Bath); physical screen.]

Figure 3.5: Multiple presentations of the same concept
provided by an abstract image.
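
The automatic update of several renditions can be sketched with a simple
notification scheme. This is an illustration under stated assumptions: the class
names are hypothetical, the "real images" are reduced to lists of text lines,
and the actual structure of abstract images is the subject of Chapter 5.

```python
class AbstractImage:
    """Intermediate structure between a client data structure and its renditions."""

    def __init__(self):
        self._items = {}  # item name -> attributes
        self._views = []  # interpretations producing real images

    def attach(self, view):
        self._views.append(view)
        view.refresh(self._items)

    def set_item(self, name, **attributes):
        """Modify the abstract image; every attached view is refreshed."""
        self._items[name] = attributes
        for view in self._views:
            view.refresh(self._items)


class DetailedPlan:
    """Interpretation with room names and furniture."""

    def refresh(self, items):
        self.lines = [
            f"{name}: {', '.join(items[name].get('furniture', []))}"
            for name in sorted(items)
        ]


class OutlinePlan:
    """Interpretation in which irrelevant details are suppressed."""

    def refresh(self, items):
        self.lines = sorted(items)


house = AbstractImage()
detailed, outline = DetailedPlan(), OutlinePlan()
house.attach(detailed)
house.attach(outline)
house.set_item("Kitchen", furniture=["table", "stove"])
house.set_item("Bath", furniture=["tub"])
```

The client program modifies only the abstract image; it never addresses either
view directly, yet both remain consistent with it.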


So far, we have described the contribution of abstract images as an output mechanism.
Figure 3.6 shows how input is processed: from the selection of a point on the screen to
the concept of the client program. Let (x,y) be the location of the point in the screen
coordinate system. The selection is first interpreted by the windowing system as a
location (x', y') relative to the top window which owns (x,y). The abstract image
machine receives a triple which identifies the window and a point (x', y') in this window.
The window identification allows the abstract image machine to identify the abstract
image, and (x', y') allows it to determine which item of the abstract image owns the
selected point. If an item is linked to a concept or a part of a concept, then the client
program is directly informed of which concept element the user has selected.

While abstract imaging automatically translates low-level input information into client
dependent concepts and takes care of multiple rendition and of window resizing and
scrolling, it does not control the interaction between the user and the
application. This task is the purpose of dialogue handling, introduced in the next
section.
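
The input path described above, from a screen point to a client concept, can be
sketched in two steps. The names are hypothetical, items and windows are assumed
to be axis-aligned rectangles, and the list of windows is assumed to be ordered
front to back.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rect:
    left: int
    top: int
    width: int
    height: int

    def owns(self, x, y):
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


@dataclass
class Window(Rect):
    ident: str = ""


@dataclass
class Item(Rect):
    concept: Optional[str] = None  # client concept linked to this item, if any


def to_window_point(windows, x, y):
    """Windowing system step: screen (x, y) -> (window id, x', y')."""
    for window in windows:  # assumed ordered front to back
        if window.owns(x, y):
            return window.ident, x - window.left, y - window.top
    return None


def pick_concept(images, triple):
    """Abstract image machine step: (window id, x', y') -> client concept."""
    ident, x, y = triple
    for item in images[ident]:  # abstract image associated with that window
        if item.owns(x, y):
            return item.concept
    return None


windows = [Window(100, 50, 200, 100, ident="plan")]
images = {"plan": [Item(0, 0, 80, 60, "kitchen"), Item(80, 0, 120, 60, "bedroom")]}
triple = to_window_point(windows, 190, 70)
selected = pick_concept(images, triple)
```

The client program receives only the selected concept; the screen and window
coordinates never reach it.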

3.5.  Dialogue  Handling 


Dialogue handling is concerned with the control and the maintenance of the
interaction. This section introduces this issue by making the analogy with actual
dialogues between human beings. Dialogue handling is then discussed from the
computer side, stressing the fact that the responsibility for the interaction must be shared
between the application and the user interface itself.

3.5.1.  Introduction 

In a conversation between individuals, the control of the dialogue is distributed among
the partners. At some point in time, one of the partners initiates the dialogue by
submitting an expression or a sequence of expressions to the interlocutors. The
expressions are processed by the partners and new expressions are produced as
results of the processing. Expressions are not elaborated by chance. Their meaning
and their syntax depend on the mental representation that each partner maintains of
the other interlocutors in the dialogue.

The interaction between a computer system and a human being should be organized
in a similar way. The control of the dialogue should be ruled according to the
respective competence of the user and the computer (see Rule 4 about user driven
interaction in Section 2.6.2.4). The user makes use of a conceptual model that gathers
semantic and syntactic knowledge about the functioning of the computer system (see
the definition of conceptual models in Section 2.3). In short term memory (see Section
2.1.4), the user maintains the state of the interaction. Similarly, the computer system
maintains a conceptual model as well as the state of the interaction. As mentioned in
Section 2.6.1, CLG offers the designer a convenient way of representing the
conceptual model and the state of the interaction with the conceptual and the
communication components.

The conceptual component describes the concepts and operations that can be
handled by the user, whereas the communication component deals with their
presentation. When considering the practical business of designing a software
architecture, the conceptual component is naturally mapped into a software component
called the application, whereas the communication component constitutes the user
interface itself. Figure 3.7 shows a simplified view of the software architecture of an
interactive system. Given this view as a basis, dialogue handling is performed partly in
the application and partly in the user interface.


3.5.2.  Dialogue  Handling  in  the  Application

In the application, the conceptual model comprises a set of domain-dependent
abstractions that allow the user to accomplish domain specific tasks. These
abstractions are data structures and operations. This is a static view of an application.
The dynamic view is concerned with the way states of the application relate to each
other. A state is the model that an application has for the interaction. It includes:

•  The  conditions  which  describe  its  relations  with  other  states. 

•  The  set  of  abstractions  that  are  accessible  to  the  external  world. 

For the application, the external world is the user interface: the user interface is its only
partner. The application receives requests from the user interface when its data
structures need to be accessed; it sends output requests to the user interface to
express changes about its state and data structures. Because device independence is
guaranteed by windowing systems, low-level details of the user's actions are hidden
from the application.
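
The dynamic view of an application can be sketched as a small state machine in
which each state lists the operations accessible to the user interface and the
state each one leads to. The states, the operations and the notify callback are
all assumptions chosen for illustration (a toy document editor), not part of the
model described in this report.

```python
class Application:
    """Functional core: it knows nothing about how it is presented."""

    # state -> {accessible operation: next state}
    STATES = {
        "empty": {"open": "clean"},
        "clean": {"edit": "dirty", "close": "empty"},
        "dirty": {"edit": "dirty", "save": "clean"},
    }

    def __init__(self, notify):
        self.state = "empty"
        self._notify = notify  # output requests sent to the user interface

    def accessible(self):
        """Abstractions accessible to the external world in the current state."""
        return sorted(self.STATES[self.state])

    def request(self, operation):
        """Entry point for the user interface; refuses inaccessible operations."""
        transitions = self.STATES[self.state]
        if operation not in transitions:
            return False
        self.state = transitions[operation]
        self._notify(self.state, self.accessible())  # express the state change
        return True
```

The user interface is the application's only partner here: it calls request and
receives notifications, while mouse clicks and key presses never reach this class.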

3.5.3.  Dialogue  Handling  in  the  User  Interface

In the user interface, the conceptual model and the state of the interaction are
maintained in a set of agents specialized in human-machine interaction. These agents
are mediators between the abstractions handled by the application and the actions of
the user. Each one takes part in the interaction. Each one is a miniature interactive
system which handles a piece of the conceptual model and a piece of the state of the
interaction. A judicious collection of such active agents defines an instantiation of a
user interface for an application. Considered as a whole, a user interface is a
translator between the formalism recognized by the application and the formalism
employed by the user. In contrast to the translation process involved in a virtual
terminal, the translation process involved in a user interface is very difficult to achieve.

Translation between formalisms for terminals relies on well-understood techniques and
theories such as finite state automata. The translation process is easy to formalize
because the functioning of the source and the target agents is well defined. In the
case of human-computer interaction, our knowledge about human behavior is rather
fuzzy. However, we do know that human behavior is not well modeled by deterministic
computer science techniques. It is not surprising then that the construction of user
interfaces is an active area of investigation. Tools for implementing user interfaces are
being progressively made available. Such tools are the topic of Chapters 4, 5 and 6.


4. Windowing Systems


4.1.  Introduction 

Modern computing systems have multiple simultaneous processes ongoing, each of
which might have some interaction with the end user. Each process hides
its interaction from other processes. The hiding is accomplished through the use of
virtual terminals. Chapter 2 introduced the abstractions associated with the notion of a
virtual terminal. Multiple virtual terminals all sharing a single physical terminal require
management of the terminal's resources. A window system is a resource manager for
the resources associated with a particular physical terminal. This section discusses
some of the issues associated with that resource management, and then discusses an
experimental method of managing the complexity associated with multiple windows.

The  resources  that  a  real  terminal  is  assumed  to  have  and  which  are  managed  by  the 
window  manager  are: 

•  High  resolution  screen.  The  screen  can  be  bitmapped,  raster  or  vector. 

•  Keyboard. 

•  Pointing  device.  A  multibutton  mouse  is  the  most  common  pointing 
device,  but  joysticks,  track  balls  and  various  gesturing  devices  also 
exist. 

• Graphic context. The color map for a particular terminal determines
which bit patterns represent which colors. The graphic context
determines other static information such as style and thickness of lines.

4.2.  Virtual  Terminal 

As introduced in Section 3.2.2, Figure 4.1 gives a picture of a single client interacting
with a physical terminal. The client provides, at some level of abstraction, images that
are displayed on the screen and handles, again at some level of abstraction, inputs that
come from the keyboard and the mouse. As a way of making concrete the hierarchy of
abstract machines introduced in Chapter 3, consider the user action of selecting an
image on the screen. Since at this point we have a client interacting directly with a
physical device, the virtual machines that are of concern are the device driver and the
terminal handler.

Figure 4.1: Single Client System.


The current cursor position is displayed through some image on the screen. The user
moves the mouse. With each increment of movement, the physical controller generates
a message to the device driver software. This software calculates the current pixel
location of the mouse and reports the location to the terminal handler. The terminal
handler generates instructions to move the cursor image to a new position on the
screen and passes those instructions to the device driver, which generates the new bit
map to be displayed. When the user performs a button down, an interrupt is generated
and passed through to the terminal handler. The terminal handler then
informs the client of a button down operation that occurred at a particular location on the
screen. When the button is released, the terminal handler is informed of another
interrupt and, in turn, informs the client of a button up at a particular location.
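
The flow just described can be sketched in a few lines of Python. This is only an
illustrative model, not code from any actual window system: the class names
(DeviceDriver, TerminalHandler, Client), their methods, and the clamping of the cursor
to the screen edges are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    """Receives only abstract button events, each tagged with a location."""
    events: list = field(default_factory=list)

    def notify(self, kind, x, y):
        self.events.append((kind, x, y))


class TerminalHandler:
    """Tracks the cursor itself and reports its position only with button events."""

    def __init__(self, client):
        self.client = client
        self.cursor = (0, 0)

    def mouse_moved(self, x, y):
        # Cursor feedback is handled at this level; the client is not informed.
        self.cursor = (x, y)

    def button(self, kind):
        self.client.notify(kind, *self.cursor)


class DeviceDriver:
    """Turns relative mouse increments into an absolute pixel location."""

    def __init__(self, handler, width, height):
        self.handler, self.width, self.height = handler, width, height
        self.x = self.y = 0

    def mouse_increment(self, dx, dy):
        # Assumption: the cursor is clamped to the physical screen.
        self.x = min(max(self.x + dx, 0), self.width - 1)
        self.y = min(max(self.y + dy, 0), self.height - 1)
        self.handler.mouse_moved(self.x, self.y)

    def button_interrupt(self, kind):
        self.handler.button(kind)


client = Client()
driver = DeviceDriver(TerminalHandler(client), width=1024, height=768)
driver.mouse_increment(40, 25)
driver.mouse_increment(10, -5)
driver.button_interrupt("button_down")
driver.button_interrupt("button_up")
# client.events now holds two button events, both located at (50, 20).
```

In this sketch the client never sees a mouse movement; it sees only the button events,
each carrying the cursor location known to the terminal handler.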

Note here several themes which will recur. The first is that the feedback associated
with the movement of the mouse and reflected in the movement of the cursor is handled
by the terminal handler. The second theme is the level of abstraction reflected in the
button events. The location of the cursor is hidden by the terminal handler and is
reported to the client only in association with another event. Another example of the
level of abstraction of the terminal handler is that it does not deal with objects on the
screen or with interpretation of events. The mapping of the cursor position into a
particular object and the interpretation of the button down, button up as a select are all
handled at a higher level of abstraction than the terminal handler.


Figure  4.2:  Multiple  Client  System. 


Once the client becomes one of a collection of clients, then the real terminal becomes a
virtual terminal. The level of abstraction managed by the virtual terminal handler is the
same as with a real terminal, but the virtual terminal handler must map the multiple
virtual terminals onto the single real terminal. The common name for this level of
abstract machine is window manager. Figure 4.2 gives a representation for the role of
the window manager. Figure 4.3 shows a collection of windows.

Figure 4.3: Actual screen.


4.3.  Single  Window 

A window is the screen portion of the virtual terminal of a process and provides the
output portion of that process. Since the window manager manages the window, it is no
longer tied to the physical screen size or shape. The window may be represented by
an icon (the lower left corner of Figure 4.3 is an icon representing the mailbox used in
rural areas of the United States; it represents the output of the mail process). Windows
may also have different sizes and locations on the screen.

One of the virtues of abstracting functionality into specific locations is that the
functionality can then be embellished without affecting the remainder of the client. In
particular, within a window system, a window has decorations, geometry, and content.


4.3.1. Decorations


Figure 4.4 displays a single window with its components identified. In addition to the
window and its contents, it has been decorated with additional functions.
These functions are:

1.  Title  Bar.  The  window  may  have  a  title  bar  which  provides  the  end  user 
with  a  cue  as  to  the  process  that  owns  the  window.  The  size  and  title 
within  the  title  bar  can  be  set  by  the  client. 

2. Close Tabs. In the lower right hand corner is a box that enables the end
user to iconify the window. That is, when the end user selects this box,
the window is turned into an icon by the window manager and additional
action is required to expand the window again.

3. Scroll Bars. It is possible that all the information that the client wishes to
display cannot be placed simultaneously on the screen. The scroll bars
allow the end user to navigate over the whole screen and display the
portion desired. This point is further explained in the section on
geometry.

Note an additional consequence of performing the abstraction. The original motive
behind providing the abstraction was to relieve the client of the management of lower
level details. Once the abstraction existed, it became embellished, and the client
now has to inform the abstraction manager (the window manager in this case) of
additional information to support the embellishments (in the example, the title for the
title bar and the shape when iconified). The end result of performing the abstraction is
that additional functionality is available to the client at lower cost than directly
implementing that functionality, but the use of the implementation of the abstraction is
not free.

Figure 4.4: Decorated Single Window.


4.3.2.  Geometry 

The client interacts with a virtual terminal with a single screen size. One of the functions
provided by the window system is the resizing of the window. The end user may
indicate to the window system that a particular window is to be resized and then
indicate the new size. The problem then becomes how to map from the size that the
client assumes to the size visible to the user. Three options are available.

1.  Display  only  a  portion  of  the  client  screen  (viewport). 

2.  Resize  the  contents  to  fit  the  visible  window. 

3.  Report  to  the  client  that  the  visible  window  has  changed  size  and  allow 
the  client  to  control  the  display. 

4.3.2.1.  Viewport 

Figure 4.5 displays the situation when a resize has occurred and the resulting window
is smaller than the client's virtual terminal. The client has a collection of information, a
portion of which has been sent to the virtual terminal. The information available to the
virtual terminal represents a canvas of information (or an offscreen bitmap). The
information available through the window is a viewport onto the canvas. The
information is maintained on the canvas using the same scale and proportions as the
information sent from the client.

Figure 4.5: Canvas concept. (The complete canvas, with the portion visible in the
window marked.)

The  viewport  can  be  moved  around  on  the  canvas  presenting  the  user  with  different 
visible  portions  of  the  canvas.  This  moving  around  is  controlled,  typically,  with  the  scroll 
bars  on  the  viewport  window. 

The distinction between the information that the client thinks is visible (the canvas) and
the information that is actually visible allows the client to generate output to the portion
of the canvas that is obscured. The two alternatives when this occurs are to block the
client until the information becomes visible to the end user or to allow the output to be
placed (logically) on the obscured portion of the canvas.

Note that the size of the information on the canvas does not change when a resize
occurs. Only the portion of the information visible to the end user changes.
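
A minimal sketch of the viewport option follows, written in Python. The Viewport class,
its field names and the policy of clamping the viewport so that it stays on the canvas
are assumptions made for illustration; the text above does not prescribe them.

```python
from dataclasses import dataclass


@dataclass
class Viewport:
    canvas_w: int
    canvas_h: int
    view_w: int
    view_h: int
    x: int = 0  # offset of the viewport's upper left corner on the canvas
    y: int = 0

    def _clamp(self):
        # Assumed policy: the viewport may not be moved off the canvas.
        self.x = max(0, min(self.x, max(0, self.canvas_w - self.view_w)))
        self.y = max(0, min(self.y, max(0, self.canvas_h - self.view_h)))

    def scroll(self, dx, dy):
        """Move the viewport over the canvas, as the scroll bars would."""
        self.x += dx
        self.y += dy
        self._clamp()

    def resize(self, w, h):
        """Resize the window: the canvas and its scale are left untouched."""
        self.view_w, self.view_h = w, h
        self._clamp()

    def visible_rect(self):
        """Return (x, y, width, height) of the canvas region shown in the window."""
        w = min(self.view_w, self.canvas_w - self.x)
        h = min(self.view_h, self.canvas_h - self.y)
        return (self.x, self.y, w, h)


vp = Viewport(canvas_w=800, canvas_h=600, view_w=400, view_h=300)
vp.scroll(500, 100)   # clamped horizontally to the canvas edge
vp.resize(200, 150)   # only the visible portion shrinks
```

Here a resize changes view_w and view_h only; canvas_w and canvas_h stay fixed, which
is the property noted above.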

4.3.2.2.  Resizing  Contents 

Another option when resizing a window is for the window manager to maintain the
same information visible to the end user. In this case, the scale of the information must
be changed. Pixel replication or sampling techniques are used to expand or shrink the
view. Handling aspect ratio changes (the ratio of the sides of the window) becomes a
very difficult problem and is typically not dealt with by the window manager. For
example, if a circle is displayed in a window and the resize extends the x direction
without modifying the y direction, stretching the image to fill the new window will result
in the circle being displayed as an ellipse.
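
The circle example can be reproduced with a small Python sketch of pixel replication
and sampling (nearest-neighbor scaling). The function names and the representation of
a bit map as a list of rows of 0 and 1 values are assumptions chosen for brevity; real
window systems operate on packed bit maps.

```python
def resize_nearest(pixels, new_w, new_h):
    """Scale a bit map by replicating (enlarging) or sampling (shrinking) pixels."""
    old_h, old_w = len(pixels), len(pixels[0])
    return [
        [pixels[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def bounding_box(pixels):
    """Return (width, height) of the smallest box enclosing all set pixels."""
    rows = [r for r, row in enumerate(pixels) if any(row)]
    cols = [c for c in range(len(pixels[0])) if any(row[c] for row in pixels)]
    return (cols[-1] - cols[0] + 1, rows[-1] - rows[0] + 1)


def disc(size):
    """A filled circle drawn in a size-by-size bit map."""
    center = (size - 1) / 2
    radius_sq = (size / 2) ** 2
    return [
        [1 if (x - center) ** 2 + (y - center) ** 2 <= radius_sq else 0
         for x in range(size)]
        for y in range(size)
    ]


circle = disc(8)
stretched = resize_nearest(circle, new_w=16, new_h=8)  # x doubled, y unchanged
```

The circle occupies an 8 by 8 box, while the stretched image occupies a 16 by 8 box:
the window manager has turned the circle into an ellipse because it knows nothing
about the meaning of the contents.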

4.3.2.3.  Informing  Client 

Informing the client that a resize event has occurred is the final option for the window
system. This option can be used in conjunction with the other two. For example,
suppose the viewport becomes larger than the underlying canvas. The client may wish
to enlarge the canvas, and the window system has no knowledge of how this is to be
accomplished.

4.3.3. Shape of Windows

In most systems, windows are rectangular. This simplifies the management of the
windows and the clipping of the information within the canvas to the window. On the
other hand, rectangular windows make certain selection and display problems difficult.
For example, two diagonal lines become difficult to separate with rectangular regions.

In at least one system (NeWS) it is possible to have an arbitrarily shaped window. The
boundary of the window is represented by spline curves and the canvas is clipped by
the curves.

4.4.  Multiple  Windows 


Window systems manage multiple virtual terminals. This gives the end user a view of
the physical screen such as that displayed in Figure 4.3. The management of the
resources of the physical terminal involves both the input portion of the terminal and the
output. In general, the problem is to allow the end user to differentiate between the
various active processes and provide input to the processes as desired.

4.4.1.  Input  Management 

The physical terminal being managed has two different types of input devices. These
are the keyboard and a pointing device (a mouse is assumed). Each of the devices
generates events of the classes discussed in Chapter 3. The terms key event and
choice event will be used to refer to the actions of the key class and the choice class.
The basic problem of the window manager is to direct the various events to the
appropriate client process. Since each window is assigned to a particular process, this
is equivalent to directing the events to the appropriate window. To assist the end user
in determining the current state of the window manager, two different types of cues are
used: the mouse cursor and the text cursor. Each provides the location of one of the
types of input devices. Together, they determine to which active process an input event
is directed.


4.4.1.1. Mouse Cursor

The mouse is assumed to have a single position within the physical screen. The
location of that position is displayed to the end user by means of a mouse cursor. The
shape of the mouse cursor can differ depending upon context, allowing
processes to give the end users a general cue as to the activities of the process.

The position of the mouse cursor is maintained at the window manager level and is
available to the client upon request or when a choice event (button press) occurs. The
client can also move the mouse cursor to a desired position on a particular window.

Choice events are always directed to the window within which the mouse cursor is
located or, more properly, to the process which owns the window. When windows
overlap, the event may be directed to a window which is hidden below
the current window (see Section 4.4.4 for a discussion of window ordering). In certain
cases, the overlapping windows are designed to support a single cognitive task (a
menu, for example), and in this case it is the responsibility of the top window to pass the
event on to underlying windows.

4.4.1.2. Text Cursor

Some windows are created as text windows. This allows them to receive key events.
Within these windows is an additional cursor, the text cursor, which indicates the current
keystroke position.

4.4.1.3. Current Focus

Keystrokes are assigned to the window that is currently the user's focus of attention.
Two models exist to determine the current focus:

1. Mouse focus. Keyboard events are assigned to the window within which
the mouse cursor is located.

2. Click to focus. The end user must explicitly assign the keyboard to a window by
selecting that window with a choice event. Keyboard events are assigned to
that window unless explicitly changed by the end user or by the client.
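
The difference between the two models can be made concrete with a short Python
sketch. The FocusManager class, the policy names "mouse" and "click", and the
restriction to non-overlapping windows are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Window:
    name: str
    x: int
    y: int
    w: int
    h: int

    def contains(self, px, py):
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


class FocusManager:
    """Decides which window receives key events under one of the two models."""

    def __init__(self, windows, policy):
        assert policy in ("mouse", "click")
        self.windows = windows  # assumed non-overlapping, for brevity
        self.policy = policy
        self.mouse = (0, 0)
        self.focus = None

    def _under_mouse(self):
        return next((w for w in self.windows if w.contains(*self.mouse)), None)

    def mouse_moved(self, x, y):
        self.mouse = (x, y)
        if self.policy == "mouse":
            # Mouse focus: the focus follows the mouse cursor.
            self.focus = self._under_mouse()

    def button_down(self):
        if self.policy == "click":
            # Click to focus: only an explicit choice event moves the focus.
            hit = self._under_mouse()
            if hit is not None:
                self.focus = hit

    def key_target(self):
        return self.focus.name if self.focus else None


windows = [Window("editor", 0, 0, 100, 100), Window("shell", 100, 0, 100, 100)]
manager = FocusManager(windows, policy="click")
manager.mouse_moved(50, 50)
manager.button_down()        # keyboard now assigned to "editor"
manager.mouse_moved(150, 50) # under click to focus, still "editor"
```

With policy="mouse" the last call would have moved the keyboard to "shell" without any
button press.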

4.4.1.4. Cognitive Aspects

Both  models  for  assigning  keyboard  events  present  problems.  If  keyboard  events  are 
assigned  to  the  window  within  which  the  mouse  currently  resides,  end  users  can  shift 
their  focus  of  attention  and  forget  to  move  the  mouse  to  reflect  the  shift.  This  results  in 
input  being  directed  to  the  incorrect  process  (from  the  end  user's  perspective). 

The  same  problem  occurs  within  click  to  focus  systems.  The  end  user  can  shift  the  focus 
of  attention  without  performing  the  actions  required  to  inform  the  system  of  that  shift. 

One method that systems use to avoid these problems is to give the end user cues
which indicate which window is currently the focus. In Figure 4.3, the window in the
middle right is the current focus. The text cursor is a square block in that system. The
current focus is indicated in two ways. First, the title bar for the window in the current
focus is darkened and, second, the text cursor is filled in within the current window and
hollow within the other windows.

Since the window system performs actions for the client (resizing, moving windows,
scrolling windows), certain events must be dedicated to specifying these actions. These
events then permeate the window system and restrict the types of interactions that a
client can specify. For example, if a resize is specified with the right button of the mouse
and the client cannot override that specification, then the right button is unavailable for
the client to use. If the client can override that specification, then resize is either
unavailable or must be specified in a different fashion depending upon which window is
to be resized. This problem is called the button overload problem.

One technique used to avoid the overloading problem is to utilize the title bars and
scroll bars as areas where window manager functions are specified. If the mouse
cursor is within a title bar or a scroll bar, the buttons perform one task; if it
is inside a window, the buttons perform other tasks.

4.4.2.  Output  Management 

The window manager displays multiple virtual screens on the same physical screen.
Typically, not all of the active virtual screens will fit simultaneously on the physical
screen. This leads to the problem of the arrangement of the windows on the physical
screen. Figure 4.3 shows seven different active windows. Two of these (the two in the
lower left corner) are represented by icons. Three are text windows (the ones in the
middle of the screen) and two are graphic windows (in the upper left and lower right
corners). The particular arrangement of windows obscures portions of some of the
active windows. The issues involved in output management are:

1.  Window  placement 

2.  Management  of  obscured  windows 

3.  Hierarchy  of  windows 

4.  Graphic  contexts 

5.  Data  interchange  between  windows 

4.4.2.1. Window Placement: Overlapping

One  strategy  for  the  placement  of  the  window  on  the  physical  screen  is  to  allow 
overlapping  windows.  This  Is  usually  associated  with  allowing  the  end  user  to  specify 
the  placement  of  windows.  The  client  generates  a  window  in  a  particular  location  and 
with  a  particular  size  and  the  user  then  has  the  ability  to  move  and  resize  the  window. 
The  user  also  has  the  ability  to  make  windows  visible. 

The basis for managing overlapping windows is to maintain a list of active windows.
Each window has a size and physical location. The windows are placed on the physical
screen in the reverse order of the list. Those windows on the top of the list then become
the ones displayed last and, consequently, become the visible windows.

There are two operations available to manage the windows on the list (other than the
create and delete operations). These are: move to top of list and move to bottom of list.
Move to top of list makes a particular window visible and move to bottom of list removes
a particular window from visibility (assuming there are windows being obscured by
the particular window). The window system has a mechanism to allow the end user to
specify those two types of events.
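
The list of active windows and its two operations can be sketched as follows in Python.
The WindowStack class and its method names are hypothetical, as is the choice to place
newly created windows at the top of the list. The window_at method is an addition that
shows how the same list can route a choice event to the topmost window under the
mouse cursor.

```python
class WindowStack:
    """Ordered list of active windows; index 0 is the top of the list."""

    def __init__(self):
        self.windows = []  # each entry is (name, x, y, width, height)

    def create(self, name, x, y, w, h):
        # Assumption: a newly created window starts at the top of the list.
        self.windows.insert(0, (name, x, y, w, h))

    def _find(self, name):
        return next(win for win in self.windows if win[0] == name)

    def move_to_top(self, name):
        win = self._find(name)
        self.windows.remove(win)
        self.windows.insert(0, win)

    def move_to_bottom(self, name):
        win = self._find(name)
        self.windows.remove(win)
        self.windows.append(win)

    def paint_order(self):
        """Windows are drawn in reverse list order, so the top is drawn last."""
        return [win[0] for win in reversed(self.windows)]

    def window_at(self, px, py):
        """Name of the topmost window containing the point, or None."""
        for name, x, y, w, h in self.windows:
            if x <= px < x + w and y <= py < y + h:
                return name
        return None


stack = WindowStack()
stack.create("A", 0, 0, 100, 100)
stack.create("B", 50, 50, 100, 100)  # overlaps A and is drawn last
stack.move_to_top("A")               # A now obscures part of B
```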

The window system also has a mechanism for iconifying and de-iconifying a window.
The iconification will not change the position of the window on the screen but will
usually cause it to take up less space on the physical screen and make other
windows visible.


4.4.2.2. Window Placement: Tiled

A tiled window manager is responsible for the size and placement of the individual
windows. The rationale for such systems is:

1. Screen real estate can be more efficiently and more simply managed by
the system than by the end user.

2. If the end user can only see a portion of a window, then that portion
should define the client's virtual terminal; and since there are no
obscured windows, the problems of output to obscured windows do not
exist.

Within a tiled window system, each client defines the minimum and maximum window
size for a virtual terminal. When less than the minimum is available, the process is
suspended. The output from a process is mapped directly into the available virtual
terminal.
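
As an illustration of the minimum and maximum size rule, the following Python sketch
tiles full-width windows down a screen of a given height. The greedy strategy used
here (admit clients in list order, then hand out spare space in the same order) is an
assumption chosen for simplicity and is not taken from any particular tiled window
manager; the function name tile and its argument layout are likewise hypothetical.

```python
def tile(screen_h, clients):
    """Assign vertical tiles to clients given as (name, min_h, max_h).

    Returns (layout, suspended), where layout is a list of (name, y, height)
    and suspended lists the clients whose minimum size could not be met.
    """
    admitted, suspended, used = [], [], 0
    for name, min_h, max_h in clients:
        if used + min_h <= screen_h:
            admitted.append([name, min_h, max_h])
            used += min_h
        else:
            # Less than the minimum is available: the process is suspended.
            suspended.append(name)

    # Distribute the remaining space, never exceeding a client's maximum.
    spare = screen_h - used
    for entry in admitted:
        grow = min(spare, entry[2] - entry[1])
        entry[1] += grow
        spare -= grow

    layout, y = [], 0
    for name, height, _ in admitted:
        layout.append((name, y, height))
        y += height
    return layout, suspended


layout, suspended = tile(
    600, [("editor", 200, 400), ("shell", 100, 300), ("clock", 350, 350)]
)
```

In this example the editor and the shell share the 600 available lines and the clock,
whose minimum of 350 cannot be met, is suspended.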

Tiled window systems will shift the location and size of a window when new windows
are created. This can be disconcerting to the end user. Evidence on receptiveness of
end users to tiled window systems is mixed. It does seem clear that massive and
frequent screen reorganizations, unless user initiated, are undesirable [Bly 86].

4.4.3.  Management  of  Obscured  Windows 

Output occurs to a virtual terminal regardless of window visibility. Windows are also
obscured by being overlapped by other windows. This leads to the problem of
redrawing a window which is newly exposed. There are two techniques for dealing
with exposure of obscured windows:

1. Generate an "exposed" event for the client process. This places the client in
charge of redrawing the exposed portion of the window. It simplifies the
problem of the window manager and saves window manager storage. If
the window manager is to have the ability to redraw each virtual terminal,
then it must maintain a current copy of each window, whether visible or
not. This can be expensive in terms of memory.

2. Maintain the virtual terminal in a separate buffer which is then mapped onto
the screen. Performance considerations dictate that a separate "frame
buffer" be maintained which is used to do the screen mapping. The
separate frame buffer limits the number of virtual terminals which can be
managed in this fashion.

4.4.4.  Hierarchies  of  Windows 

Up  to  this  point,  all  of  the  windows  were  assumed  to  be  bound  to  distinct  processes  and 
to  be  independent.  This  allows  one  window  to  be  repositioned  without  any  effect  on  the 
other  windows.  For  some  purposes,  windows  should  be  considered  to  be  related  and 
either  moved  together  or  constrained  not  to  be  moved  outside  a  particular  region.  Some 
examples  are: 

1. Figure 4.6 displays a menu. The items of the menu are, in fact, windows
all residing within a parent menu. Because the window system will
determine within which window the cursor is located, this formulation is
more convenient for the client than treating the menu items as the
contents of a single window. If the menu items are treated as the
contents of a single window, then the client must determine which item
was chosen when a choice event occurs. Using the parent, child
concept, the window system will do the determination. When the parent
window is positioned, all of the items of the window should be positioned
relative to the parent window.

2. Figure 4.5 displays the canvas, viewport concept. An easy mechanism
for managing this relationship is the parent, child concept. The way it is done is
slightly counter-intuitive and relies on the fact that the window system
clips a window based on its parent. The viewport is the parent window
and the canvas is the child window. Then the portion of the canvas that
is visible is determined by the clipping mechanism applied to the
viewport window. Scrolling is accomplished by moving the canvas
rather than the viewport.


Figure 4.6: Example of menu. (A parent menu window containing five child windows,
labeled Window 1 through Window 5.)

Windows  can  be  specified  by  the  client  to  form  a  hierarchy.  Within  this  hierarchy, 
children  are  positioned  relative  to  the  parent.  The  children  can  be  moved 
independently  of  the  parent  but  the  calculation  of  their  position  on  the  screen  is  done  by 
first  determining  the  position  of  the  parent  and  then  the  position  of  the  child  within  the 
parent.  Children  are  clipped  based  on  their  parents.  Thus,  when  a  child  window  is 
moved  off  of  the  edge  of  the  parent,  only  a  portion  of  the  child  remains  visible. 

A choice event or a key event is directed to the visible window. If that window does not
wish to handle the event, it will direct it to its parent, and so on up the hierarchy. All
windows are children of the root window and it consumes any unwanted event.
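
The three properties of the hierarchy (positions relative to the parent, clipping by the
parent, and events passed up toward the root) can be sketched together in Python.
The Window class below, its handles set and its method names are assumptions made
for the example and do not correspond to the interface of any particular window system.

```python
class Window:
    def __init__(self, name, x, y, w, h, parent=None, handles=()):
        # x and y are relative to the parent's upper left corner.
        self.name, self.x, self.y, self.w, self.h = name, x, y, w, h
        self.parent, self.children = parent, []
        self.handles = set(handles)  # event kinds this window wants
        if parent is not None:
            parent.children.append(self)

    def origin(self):
        """Screen position: the parent's position plus our offset within it."""
        if self.parent is None:
            return (self.x, self.y)
        px, py = self.parent.origin()
        return (px + self.x, py + self.y)

    def visible_rect(self):
        """Screen rectangle (x0, y0, x1, y1) after clipping by all ancestors."""
        ox, oy = self.origin()
        rect = (ox, oy, ox + self.w, oy + self.h)
        if self.parent is None:
            return rect
        p = self.parent.visible_rect()
        if p is None:
            return None
        clipped = (max(rect[0], p[0]), max(rect[1], p[1]),
                   min(rect[2], p[2]), min(rect[3], p[3]))
        return clipped if clipped[0] < clipped[2] and clipped[1] < clipped[3] else None

    def window_at(self, px, py):
        """Deepest visible window containing the point (later children on top)."""
        r = self.visible_rect()
        if r is None or not (r[0] <= px < r[2] and r[1] <= py < r[3]):
            return None
        for child in reversed(self.children):
            hit = child.window_at(px, py)
            if hit is not None:
                return hit
        return self

    def dispatch(self, event):
        """Pass the event up the hierarchy; the root consumes what nobody wants."""
        win = self
        while win.parent is not None and event not in win.handles:
            win = win.parent
        return win.name


root = Window("root", 0, 0, 1000, 800)
menu = Window("menu", 100, 100, 200, 90, parent=root, handles={"key"})
item1 = Window("item1", 0, 0, 200, 30, parent=menu, handles={"choice"})
item2 = Window("item2", 0, 30, 200, 30, parent=menu, handles={"choice"})
item3 = Window("item3", 150, 60, 200, 30, parent=menu, handles={"choice"})
```

Here item3 extends past the right edge of the menu, so only the part inside the menu
remains visible, and a point over the clipped part falls through to the root window.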

The hierarchy notion allows many applications. Menus have already been discussed.
Another use of hierarchies is the title bar and scroll bar concepts that have been
discussed. The parent window has child windows which represent the title bar, the
scroll bars, etc. Again, this allows the window system to determine the cursor position
rather than forcing the client to perform the determination. It is, of course, possible for
the client to attach its own title bar and scroll bars to the window and use a different
mechanism than the window mechanism.

One determining factor in whether child windows are used for auxiliary functions
such as menus and title bars is the performance of the window system. Using the
window manager for such purposes will generate several hundred windows very
quickly. If the window manager is efficient enough to manage a large number of
windows, then the window abstraction provides a very attractive solution to choice
problems. See Section 5.2.3.3 for a discussion of facilities for manipulating direct
manipulation user interfaces.


4.4.5. Graphic Context


The graphic context defines color maps, line style and other graphic attributes (Section
5.3 gives a fuller discussion of graphic concepts). Within a window system, each
process has a graphic context and the system typically changes the current graphic
context whenever the window focus changes.

4.4.6.  Data  Interchange  Across  Windows 

Since multiple windows are being managed by the same window manager, it becomes
possible to transfer information from one process to another through the window
manager. This "cut and paste" facility is implemented by retrieving information from one
window (client process) at the level of abstraction of the underlying communication
mechanism (see the next section) and communicating that information to a second
window (client process). The second process must be able to recognize the structure of
the information received, but windowing systems automatically provide a level of
data interchange between processes which is at a higher level of abstraction
than pure bit maps.

4.5.  Networking  Considerations 

The functionality of the window manager can be implemented in a variety of different
manners. The possible partitionings of the functionality are [Gosling 86]:

1.  Replicate  the  window  manager  functionality  in  the  address  space  of 
each  client  process. 

2. Install the window manager functionality in the kernel of the operating
system, outside the address space of the clients.

3.  Have  a  separate  window  server  process  which  is  outside  both  the  kernel 
and  the  client  address  spaces. 

The problem with the first option (replication) is the difficulty of multiple processes
accessing the same window, since the window is maintained in the address space of
the process. The problem with the second option (embed in kernel) is that it overloads
the functionality of the kernel. In order to modify the window manager the kernel must
be modified, and this introduces configuration problems on most systems. The
technique being used by most window systems is the third option. The client processes
are considered to be clients of a single server. A number of consequences flow from
this partitioning of the functionality.

4.5.1.  Communication 

Since the client is in a distinct address space from the server, they must communicate
through some fixed protocol. The fixed protocol uses the underlying operating system
inter-process communication mechanism, and performance issues become important.
The performance of inter-process communication mechanisms depends upon the
volume of traffic sent through the mechanism. Within window systems, the protocol for
communication is defined at a higher level of abstraction than bit maps in order to
reduce the volume of traffic. The X Window System [Scheifler 86] has commands such as
"draw circle" or "draw line", and graphical communication is handled at that level of
abstraction. The NeWS system [SUN 87] sends messages which carry PostScript
programs [Adobe 85]. PostScript is a display formatting language described in more
detail in Section 5.3.
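
The traffic argument can be made concrete with a rough size comparison. The following
Python sketch is illustrative only: the request layout (a one-byte opcode followed by
four 16-bit coordinates) and the opcode value are assumptions made for this example and
are not taken from the X or NeWS protocols.

```python
import struct

OP_DRAW_LINE = 1  # hypothetical opcode, not taken from any real protocol


def encode_draw_line(x1, y1, x2, y2):
    """Pack a 'draw line' request: 1-byte opcode + four signed 16-bit coordinates."""
    return struct.pack("!Bhhhh", OP_DRAW_LINE, x1, y1, x2, y2)


def bitmap_size_bytes(width, height, bits_per_pixel=1):
    """Bytes needed to ship the raw pixels of a width x height area."""
    return (width * height * bits_per_pixel + 7) // 8


request = encode_draw_line(0, 0, 639, 479)
print("high-level request:", len(request), "bytes")
print("monochrome bitmap: ", bitmap_size_bytes(640, 480), "bytes")
```

Under these assumptions, a diagonal line across a 640 by 480 monochrome area costs a
few bytes as a request, against tens of kilobytes if the affected pixels were shipped.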

This ability to communicate at higher levels of abstraction is also exploited to allow the
client to change the interpretation of key or choice events. The use of PostScript allows
the "downloading" of actual programs which can change any facet of the window
system behavior.

The use of the operating system's inter-process communication mechanism means that
the communication between the client and the server is asynchronous. The order of
communications in any one direction is maintained, but the relative sequencing of
messages travelling in opposite directions is not.
Each window could have a collection of clients and each client could have a variety of
different windows being managed by the server.
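
One way to picture this asynchrony is as two independent first-in, first-out queues,
one per direction. The sketch below rests on that assumption; the Connection class and
the message strings are hypothetical and do not describe any particular window system.

```python
from collections import deque


class Connection:
    """Two independent FIFO queues, one per direction (an assumed model)."""

    def __init__(self):
        self.to_server = deque()  # requests sent by the client
        self.to_client = deque()  # events and replies sent by the server


conn = Connection()

# The client issues requests without waiting for any reply.
conn.to_server.append("create window")
conn.to_server.append("draw line")

# Meanwhile the server queues an event for the client.
conn.to_client.append("key pressed")

# Each direction is read back in the order in which it was written.
requests = [conn.to_server.popleft() for _ in range(2)]
events = [conn.to_client.popleft()]
print(requests, events)
```

Nothing in this model says whether "key pressed" was produced before or after
"draw line" was consumed, which is the sequencing information that is not maintained.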

4.5.2.  Networking 

The use of the operating system's inter-process communication mechanisms for
communication between client and server allows the client and server to be distributed
across a local area network. Figure 4.7 displays a network which exploits the distinction
between clients and servers. A client resides on one workstation and can have a server
which resides on a different physical workstation and manages a different physical
terminal. The implications of this type of structure are still being explored for various
client domains.


[Figure 4.7 labels: Client 1, Client 2, Server 1, Server 2, Terminal 1, Terminal 2]

Figure 4.7: Network of servers.


4.6.  Desirable  Features  of  Window  Systems 

A  number  of  the  items  discussed  are  important  features  in  the  evaluation  of  any  window 
system.  They  are: 

1. Does the system separate basic mechanisms for managing windows
from the policies involved in the management? NeWS or the X Window
System, for example, support either tiling or overlapped windows. It is the
responsibility of the client to adopt a policy and the window system will
provide the mechanisms.

2. Does the system provide one communication channel per client
process? When this is so, the client is guaranteed to receive events in the
right order. If there were one communication channel per window, then
distinguishing the order of events across windows would become difficult.
Having one communication channel per client also avoids polling by the
client on all of the channels to see if an event has arrived.

3. Does the system allow the definition of a hierarchy of windows? When
using a direct manipulation interface, it is important to be able to handle
object overlapping. Object overlapping is easily handled within a
hierarchy of windows. Movement of the parent will move the entire
object.

4. Does the system provide the client with offscreen bitmaps (or canvases)
with the same graphics operations as visible windows? If the client needs
to distinguish between visible and obscured windows in order to perform
basic operations, then the interaction between the client and the window
system becomes needlessly complex. Also, the offscreen bitmap acts as
a cache for pixels and becomes a performance enhancement
mechanism.

5. Does the system allow the clients visibility into and use of
non-window-management facilities? For example, communication between
various clients is greatly simplified if the window system's communication
facilities are available.

6. Does the system allow the clients control in the case of failures? For
example, if the client requests an unavailable font, then the window
system should have a well-defined, consistent method of allowing the
clients to determine strategy. This facility is important in the building of
robust systems.
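
The second criterion above can be illustrated with a small sketch. The Python below
assumes a single queue per client into which events for all of that client's windows
are posted; the Event and ClientChannel names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    window: str  # window in which the event occurred
    kind: str    # e.g. "button-down", "key"


class ClientChannel:
    """One channel per client: events for all of its windows share one queue."""

    def __init__(self):
        self._queue = deque()

    def post(self, event):
        self._queue.append(event)

    def next_event(self):
        return self._queue.popleft() if self._queue else None


channel = ClientChannel()
channel.post(Event("editor", "button-down"))
channel.post(Event("palette", "button-up"))
channel.post(Event("editor", "key"))

order = []
while (event := channel.next_event()) is not None:
    order.append((event.window, event.kind))
print(order)
```

Because there is a single queue, the relative order of the "editor" and "palette"
events is preserved, and the client reads one channel instead of polling one channel
per window.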

4.7.  Rooms 

A particularly interesting user interface which has been developed on top of a window
system is Rooms by Card and Henderson [Card 87]. It is an example of how cognitive
studies and information can be used to develop better user interface software.

The first step in the development of Rooms was to analyze the way in which people
used windows. The data gathered showed that people used windows in groups. That is,
activity was localized in one group of windows, then transferred to a second group of
windows and localized there, and so on. This pattern of activity supports the hypothesis
that an end user performs one task at a time. The windows in which activity was
localized were those windows which supported the particular task being performed.

The second step was the realization that the set of all existing windows could be
collected into the groups within which activity was localized and that these groups
could be made the basis for a system. The metaphor of rooms in a building and the
windows within each room was used as the basis for building a system.

In Rooms, the end user is provided with a collection of rooms in a building. Examples
might be the mail room or the project meeting room. Within each room, windows can
be created or destroyed. A particular room is current at any point in time, and within this
room all of the windows are exposed. Rooms which are not current (in the metaphor,
rooms in which there are no occupants) are represented as icons. Thus, when a user
moves from one room to another (changes tasks), the windows in the room being exited
become unavailable and the windows in the room being entered become available.

Each room is given a different background so the user can tell which room is currently
active, and an architectural plan of the building is kept available so that the user can
determine how to navigate from one room to another.

There are a number of additional features to Rooms (window sharing and expanding
upon the metaphor), but the heart of the system came from the realization that people
used windows in a localized manner and that if the system supported this localization
then windows would be used more efficiently. Before-and-after studies showed that the
typical user managed about three times as many windows using Rooms as using a
normal window manager. Since users manage as many windows as they can
comfortably handle, Rooms increased the number of windows with which a user is
comfortable. Rooms is an outstanding example of the connection between
understanding the cognitive machine of the end user and the requirements of software.
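
The organization described above can be sketched as a small data structure. This is
not the Rooms implementation; the Building class, its method names and the sample
rooms are assumptions made for illustration.

```python
class Building:
    """A set of rooms, each holding the windows that support one task."""

    def __init__(self):
        self.rooms = {}      # room name -> list of window names
        self.current = None  # name of the room the user occupies

    def add_room(self, name):
        self.rooms[name] = []
        if self.current is None:
            self.current = name

    def open_window(self, room, window):
        self.rooms[room].append(window)

    def enter(self, room):
        """Changing rooms corresponds to changing tasks."""
        self.current = room

    def visible_windows(self):
        """Only the windows of the current room are exposed."""
        return list(self.rooms[self.current])

    def icons(self):
        """Rooms that are not current are shown as icons."""
        return [name for name in self.rooms if name != self.current]


building = Building()
building.add_room("mail")
building.add_room("project meeting")
building.open_window("mail", "inbox")
building.open_window("mail", "compose")
building.open_window("project meeting", "agenda")

before = building.visible_windows()
building.enter("project meeting")
after = building.visible_windows()
print(before, after, building.icons())
```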

4.8. Introduction to Toolkits

The  level  of  abstraction  available  from  a  window  manager  is  really  too  low  for 
convenient  use  by  a  client  programmer.  The  client  receives  detailed  knowledge  of 
choice  events  (button  up  and  button  down  are  separate  events,  for  example)  and  the 
ability  to  determine  the  location  of  the  mouse  cursor  within  a  window.  The  client  also 
specifies  precisely  the  type  of  output  to  be  placed  within  a  window. 

At a higher level of abstraction, the client programmer would have available a library of
interaction objects, each with its own geometry and behavior. Objects such as command
buttons, dials, and sliders could be used to interact with the client at the level of "object
selected" and "value set." These types of interactors are available in toolkits and are
discussed extensively in the next chapter.
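
The difference between the two levels of abstraction can be sketched as follows. The
CommandButton class below is a hypothetical interaction object, not part of any real
toolkit: it consumes separate button-down and button-up events and reports a single
"object selected" notification.

```python
class CommandButton:
    """A hypothetical interaction object with its own geometry and behavior."""

    def __init__(self, label, x, y, width, height):
        self.label = label
        self.x, self.y, self.width, self.height = x, y, width, height
        self.armed = False  # True after a button-down inside the button

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)

    def handle(self, kind, px, py):
        """Consume a low-level event; report a selection when a click completes."""
        inside = self.contains(px, py)
        if kind == "button-down":
            self.armed = inside
        elif kind == "button-up":
            selected = self.armed and inside
            self.armed = False
            if selected:
                return "object selected"
        return None


ok = CommandButton("OK", x=10, y=10, width=40, height=20)
low_level_events = [
    ("button-down", 15, 15),
    ("button-up", 20, 18),     # released inside: a complete click
    ("button-down", 15, 15),
    ("button-up", 200, 200),   # released outside: no selection
]
results = [ok.handle(kind, px, py) for kind, px, py in low_level_events]
print(results)
```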

5. Toolkits


Tools for implementing user interfaces are becoming more available. Although they
aim at the same goal, they are not all equivalent. The purpose of this chapter is to
present a classification that organizes the space of existing tools into classes. Each
class is characterized by the level of services it offers to the implementer. Tools for the
construction of user interfaces range from low-level toolkits to the more elaborate
User Interface Management Systems (UIMS).

A brief taxonomy of user interface tools is presented in the first section. In Section 5.2,
attention is focused on toolkits per se. One important component of toolkits is their
facilities for graphics. This topic is presented in the last section of the chapter.
Sophisticated tools known as User Interface Management Systems are described in
Chapter 6.

5.1.  A  Taxonomy  of  Tools  for  User  Interface 

As shown in Figure 5.1, tools for the development of user interfaces come in two
categories: toolkits and User Interface Management Systems (UIMS).


Figure 5.1: A taxonomy of tools for the construction of user interfaces.


A toolkit is a set of building blocks that the implementer assembles to manufacture a
user interface. It provides the programmer with a wide range of functions, from the
low-level management of the workstation, such as windowing, graphics, sound and text
editing, to the higher level of dialogue handling in the form of menus, buttons, control
panels, etc.

User Interface Management Systems come in two forms: user interface run time kernels
and user interface environments. A user interface run time kernel is a skeleton into
which the functional components of applications can be embedded. A user interface
environment automatically generates a user interface from the specification provided by
the designer and links the interface to the application. To do so, it includes a run
time kernel into which the application and the "compiled" specification are plugged.

In summary, toolkits provide the building blocks, run time kernels package the code that
implements the foundation of an interactive system into a reusable and extensible
skeleton, and a user interface environment automatically generates the specific aspects
of a user interface from high-level specifications. When considering the ease of
construction, the level of service increases from toolkits to user interface environments.

5.2.  Toolkits 


5.2.1.  Overview:  General  Services 

Figure 5.2 shows a classification of the types of services provided by any toolkit. These
services can be organized in two categories: services related to the management of the
workstation and services for the management of the dialogue.

Services  for  the  management  of  the  workstation  define  a  virtual  terminal  as  presented 
in  Chapter  4.  Abstractions  vary  from  one  toolkit  to  another,  but  they  usually  include: 

•  Foundations  for  graphics  (e.g.,  offscreen  bitmaps  or  canvas,  viewports, 
windows). 

• Primitive graphic entities (e.g., icons, cursor shapes).

•  Elements  for  text  processing  (fonts),  and  sound. 

•  Support  for  event  handling. 

Services for the management of the dialogue rely on the abstractions defined for the
management of the workstation. They include:

• Elementary entities for dialogue handling such as buttons and scrollbars.

•  Compound  objects  such  as  menus  and  forms. 

In addition, recent toolkits such as the X Toolkit propose a model and a general
mechanism for building special purpose dialogue objects.

For some toolkits, such as the Macintosh Toolbox [Rose 86], workstation management
and dialogue management are gathered in a single library. For others, the distinction
between the two levels of services is more explicit. For example, in the X Window System
environment, services for the management of the workstation are accessible through
Xlib whereas services for the management of the dialogue are gathered in the X Toolkit.

5.2.2. Advantages and Drawbacks of Toolkits


5.2.2.1. Advantages

Like any library, a toolkit is a convenient support for portability and flexibility. A third
advantage is specific to the user interface domain: a toolkit defines a consistent style of
interaction.

1. Portability. Software portability is one of those practical problems that
computer scientists face continuously. Knowing that the user interface
part of an interactive system can represent up to 80% of the code, the
portability of user interfaces deserves special attention. Toolkits offer a
convenient and natural way of defining levels of portability.

2. Flexibility. Software flexibility covers issues of diversity and
extensibility. Diversity is concerned with the availability of various levels
of abstraction. With user interface toolkits, the programmer has the
choice between low-level services that allow him to control the
workstation at a very fine-grained level of detail and high-level services
that provide him with ready-to-use local dialogues. Extensibility is the ability
to add new features. As mentioned earlier, recent toolkits provide the
programmer with a mechanism for building new interaction techniques.

Other toolkits, in particular those integrated into an object-oriented
environment, encourage software reuse through the subclassing
mechanism.

3. Consistent style of interaction. Toolkits include a variety of interaction
techniques that can be reused from one application to another. As a
result, they define a style of interaction with which the user can
progressively become familiar. In addition, the behavior of the
interaction techniques has been determined in accordance with
ergonomics principles. For example, in order to facilitate the evaluation
stage, a button displays itself in reverse video as it is visited by the
mouse.

The ability to determine the arrangement of the building blocks allows the implementer
to fully control the behavior of a user interface. Unfortunately, this freedom has its
price.

5.2.2.2.  Drawbacks 

Toolkits  do  not  embed  any  software  architecture;  they  are  hard  to  use  and  they  lead  to 
duplication  of  efforts. 

1. Wrong Software Architecture. A library does not embed an architecture.
In particular, user interface toolkits do not enforce the modular distinction
between the application and the user interface. As a result, toolkits may
lead to questionable software architectures where the expression of the
user interface is mixed with the expression of domain dependent
functions. Mixing the two aspects impedes the maintenance of the
interactive system and does not make it possible to iteratively adjust the
user interface.

2. Long Learning Phase. As Figure 5.3 illustrates, a toolkit is a big bag
of functions. Finding the right arrangement may be a tremendous
technical barrier, especially for the first-time developer.

3. Duplication of Efforts. The glue must be made anew for each
interactive system. It is not surprising then that a strong interest has
recently emerged in run time kernels that provide implementers with
reusable code organized in a ready-to-use architecture. This facility will
be further described in Chapter 6.

Figure 5.3: Toolkits provide the implementer with a set of building blocks
to be glued together.


5.2.3.  Comparative  Analysis 

Toolkits differ mainly in the control strategy they embed, the ability to allow the
programmer and the user to overload and customize presentation policies, and facilities
for implementing direct manipulation interfaces. These issues are developed in turn
in the next paragraphs.

5.2.3.1.  Control  Strategy 

Protocols for acquiring and processing events have a strong impact on the control
structure of a system. With regard to user interface toolkits, there are two types of
protocols, depending on whether or not the control strategy is embedded in the
interaction techniques.

When the control strategy is embedded, the interaction techniques have a mechanism
to process events. This mechanism is automatically activated when an event is of
interest to the technique. (A technique can express interest in classes of events at
any time.) When the technique has completed processing an event, it automatically
calls a procedure provided by the client program. This procedure performs some
domain dependent computation. If no callback procedure has been specified for the
event class, there is no further processing: the event class has no domain
dependent meaning. X Toolkit widgets and NeWS interactive objects are built
according to this policy.
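
A minimal sketch of the embedded strategy follows. It does not reproduce the X Toolkit
or NeWS interfaces; the Widget class, the dispatch function and the event class names
are assumptions chosen to mirror the description above.

```python
class Widget:
    """An interaction technique that owns its event processing."""

    def __init__(self, name, interests):
        self.name = name
        self.interests = set(interests)  # event classes of interest
        self.callbacks = {}              # event class -> client procedure
        self.feedback = []               # stands in for the technique's own processing

    def add_callback(self, event_class, procedure):
        self.callbacks[event_class] = procedure

    def process(self, event_class):
        self.feedback.append(event_class)          # technique-level processing
        callback = self.callbacks.get(event_class)
        if callback is not None:                   # no callback: nothing further
            callback(self)


def dispatch(widgets, event_class):
    """Activate every technique that expressed interest in this event class."""
    for widget in widgets:
        if event_class in widget.interests:
            widget.process(event_class)


log = []
quit_button = Widget("quit", interests=["button-down", "button-up"])
quit_button.add_callback("button-up", lambda w: log.append(w.name + " activated"))

for event_class in ["button-down", "button-up", "key"]:
    dispatch([quit_button], event_class)
print(quit_button.feedback, log)
```

The client program supplies only the callback; it never inspects raw events itself.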

When the control strategy is not embedded in interaction techniques, the processing
sequence has to be specified by hand. The programmer needs to explicitly ask each
possible interaction technique whether it is concerned by the event. If so, the
programmer chooses one of the possible methods attached to the technique. The
technique has no event handler; it has a collection of methods that can be invoked. The
technique is not an agent endowed with capabilities for decision making; it is a passive
server. The Macintosh Toolbox is based on a non-embedded control strategy.

To summarize, the embedded control strategy automatically performs the sequence of
actions for processing events, and client programs are called for complementary
processing. Conversely, when the control strategy is not embedded, the
programmer is in charge of gluing the pieces of processing together.
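
For contrast, the non-embedded strategy can be sketched as an event loop written by the
programmer. The PassiveButton class and event_loop function are hypothetical and are
not Macintosh Toolbox calls; they only illustrate who holds the control.

```python
class PassiveButton:
    """A passive server: it offers methods but takes no decisions."""

    def __init__(self, name, x, y, width, height):
        self.name = name
        self.x, self.y, self.width, self.height = x, y, width, height
        self.highlighted = False

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)

    def set_highlight(self, on):
        self.highlighted = on


def event_loop(events, buttons):
    """The programmer glues the processing together by hand."""
    activated = []
    for kind, px, py in events:
        for button in buttons:
            if not button.contains(px, py):   # ask each technique in turn
                continue
            if kind == "button-down":
                button.set_highlight(True)    # choose which method to invoke
            elif kind == "button-up":
                button.set_highlight(False)
                activated.append(button.name) # domain dependent processing
            break
    return activated


buttons = [PassiveButton("open", 0, 0, 50, 20), PassiveButton("quit", 60, 0, 50, 20)]
events = [("button-down", 70, 5), ("button-up", 70, 5), ("button-up", 500, 500)]
activated = event_loop(events, buttons)
print(activated)
```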

5.2.3.2.  Overloading  and  Customizing  Interaction  Techniques 

A consistent style of interaction is a desirable feature. However, the style defined by a
toolkit cannot be expected to be satisfactory for every situation; in some circumstances
standard behavior needs to be adjusted. The adjustment can be performed either by
the programmer or by the user.

Programmers may desire to modify the visible behavior of an interactive object or its
internal functional behavior. Toolkits based on the object-oriented paradigm, such as
the ones in the Smalltalk-80 [Goldberg 84] or Loops [Bobrow 83] environments, encourage
such overloading: the programmer defines a new subclass and overloads the inherited
methods with special purpose code. Toolkits such as the Macintosh Toolbox,
although they claim to be designed according to the object-oriented paradigm, make
the modification much harder, hard enough to be discouraging!
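
The subclassing mechanism can be sketched in a few lines. The Button and RoundButton
classes below are hypothetical; they stand for a toolkit class and a programmer's
special purpose subclass.

```python
class Button:
    """A toolkit-provided interaction technique (hypothetical)."""

    def __init__(self, label):
        self.label = label

    def render(self):
        """Visible behavior: how the button is presented."""
        return "[ " + self.label + " ]"

    def activate(self):
        """Functional behavior: what a selection reports."""
        return self.label + " selected"


class RoundButton(Button):
    """Overloads only the presentation; everything else is inherited."""

    def render(self):
        return "( " + self.label + " )"


standard = Button("OK")
custom = RoundButton("OK")
print(standard.render(), custom.render(), custom.activate())
```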

Users may want to customize a user interface without getting involved in a
programming task. The type of customization that is currently feasible without
programming is concerned with the lexical level only. To allow it, a toolkit must
provide an external permanent representation for interaction techniques. External
means that the description of the interaction technique is not wired into the code of the
user interface. Permanent means that the existence of the representation is not tied
to the execution of the interactive system. Files provide a convenient way of
maintaining permanent data. Finally, the external representation can serve as input
data to an editor which allows the user to interactively customize the lexical aspects of
the interaction techniques. The notion of resource developed for the Macintosh
Toolbox is an excellent illustration of how lexical customization can be performed by
any user.
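
The idea of an external permanent representation can be sketched with a small file of
lexical attributes. This is not the Macintosh resource format; the JSON layout, the
attribute names and the load_resource function are assumptions made for illustration.

```python
import json
import os
import tempfile

# Lexical attributes compiled into the program, used when the file is silent.
DEFAULTS = {"label": "OK", "font": "Geneva", "size": 12}


def load_resource(path):
    """Merge the external description of a button over the built-in defaults."""
    with open(path, encoding="utf-8") as resource_file:
        return {**DEFAULTS, **json.load(resource_file)}


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "button.json")
    # A user (or a resource editor) changes the label without touching the code.
    with open(path, "w", encoding="utf-8") as resource_file:
        json.dump({"label": "Valider"}, resource_file)
    button = load_resource(path)

print(button)
```

The file outlives any single run of the program, and editing it requires no
recompilation, which corresponds to the two properties named in the text.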


5.2.3.3.  Facilities  for  Implementing  Direct  Manipulation  Interfaces 

User  interfaces  based  on  the  direct  manipulation  metaphor  are  very  demanding  on  the 
software  side.  In  particular,  an  object  may,  as  a  whole,  be  constrained  to  follow  the 
movements  of  the  mouse  and,  as  a  part,  be  locally  edited  in  real  time. 

Mouse tracking requires a loop of three software actions: erase the object from its
previous location, repair the surface that has been damaged, and draw the object at the
new location. Current toolkits do not provide much support for satisfying these
requirements. The Macintosh Toolbox offers the notion of region that the client program
can drag around as long as the user holds the mouse button down (cf. primitive
DragGrayRgn). However, this local facility, although very convenient, is not a general
mechanism to deal with overlapping objects. The X Window System, with its recursive
notion of overlapping windows, offers an attractive foundation for implementing
overlapping objects.
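
The three-action loop can be sketched on a character grid standing in for the screen.
The function names and the use of a saved background copy for the repair step are
assumptions of this example, not a description of any toolkit.

```python
def make_surface(width, height, fill="."):
    return [[fill] * width for _ in range(height)]


def track(surface, background, start, mouse_path, glyph="#"):
    """Move a one-cell object along mouse_path using erase, repair, draw."""
    x, y = start
    surface[y][x] = glyph
    for new_x, new_y in mouse_path:
        surface[y][x] = " "               # 1. erase the object
        surface[y][x] = background[y][x]  # 2. repair the damaged surface
        surface[new_y][new_x] = glyph     # 3. draw at the new location
        x, y = new_x, new_y
    return x, y


background = make_surface(3, 1)
background[0][1] = "+"                    # something lying under the path
surface = [row[:] for row in background]

final_position = track(surface, background, (0, 0), [(1, 0), (2, 0)])
print("".join(surface[0]), final_position)
```

The "+" cell is covered while the object passes over it and is restored afterwards,
which is the repair that overlapping objects require.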


CMU/SEI-89-TR-4 


Editing part of an object is a second heavy requirement on software programming.
Objects are usually compound entities. Sometimes, they are treated as wholes (as in
mouse tracking) and sometimes as parts (as in editing tasks). Graphics tools available
in user interface toolkits either do not have any facilities for encapsulation or they have
encapsulation facilities which hide access to the parts. In the first case, there is no way
to consider the object as a whole. In the second case, there is no way to edit part of the
object. For example, pictures and regions of the Macintosh Toolbox, and GKS
segments, are like graphics macrocommands. The client program can execute them
with different parameters involving location, rotation and scaling. Pictures, regions and
segments are mechanisms for encapsulation. They allow for the definition of a
graphics object from elementary graphics primitives. However, if the client program
needs to modify a line segment of the object as the user moves the mouse, the picture,
the region and the segment do not allow this. The picture, region or segment must be
destroyed and rebuilt with the new line segment! The following section describes
graphics tools that are more appropriate for interactively editing graphics objects.


5.3.  Graphics  Tools  for  Abstract  Imaging 

Information layout can be viewed as a sequence of transformations from internal
domain dependent data structures to actual images. Information acquisition, from a
selected point in an actual image to some internal data structure, is the reverse
sequence of transformations. This section presents two general techniques that
automatically perform these two-way transformations. The first focuses
attention on structural relationships between the components of an image. The second
is based on a general constraint problem solver approach. Before describing these
techniques, we need to briefly review low-level graphics tools.

5.3.1.  Low-level  Graphics  Tools 

Low-level graphics tools such as CGI [ISO 86b] define a graphics machine for drawing
lines, circles, etc. in a graphics coordinate space. Other tools such as PostScript [Adobe
85], QuickDraw [Rose 86] and GKS [ISO 85] include a simple encapsulation
mechanism. They respectively propose the notions of path, region/picture and
segment. Although encapsulation is a convenient way of grouping logically connected
information, it is not adequate for interactively editing parts of compound graphics
objects. PostScript, however, deserves additional comments.

PostScript is a powerful programming language that has the ability to describe the
appearance of any type of information on a rendition surface (paper or screen). Its
power is Turing equivalent; the syntax incorporates a postfix notation and the data
model includes, like LISP, the ability to treat programs as data. The PostScript imaging
model is very general and very simple. Figure 5.4 illustrates the model. Imaging is
based on a stencil/paint model. A stencil is an outline specified by an infinitely thin
boundary that is piecewise composed of spline curves. Paint is some pure color or
texture or even an image which is dropped on the drawing surface through the
stencil. PostScript has been extended to serve as the programming interface for
NeWS: client programs are not limited to a predetermined set of requests but can
download PostScript programs to the NeWS server.
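
Postfix notation and the treatment of programs as data can be illustrated with a toy
stack evaluator. The sketch below is not a PostScript interpreter: it handles only
numbers, nested lists standing for procedures, and four operators named after the
PostScript operators add, mul, dup and exec.

```python
def run(program, stack=None):
    """Evaluate a list of tokens in postfix order on an operand stack."""
    stack = [] if stack is None else stack
    for token in program:
        if isinstance(token, (int, float, list)):
            stack.append(token)            # numbers and procedures are data
        elif token == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif token == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif token == "dup":
            stack.append(stack[-1])
        elif token == "exec":
            run(stack.pop(), stack)        # execute a procedure taken from the stack
        else:
            raise ValueError("unknown operator: " + str(token))
    return stack


# Analogous to the postfix text: 3 4 add {2 mul} exec
result = run([3, 4, "add", [2, "mul"], "exec"])
print(result)
```

The procedure [2, "mul"] is pushed like any other value and only executed when exec
is reached, which is the property that lets a client download behavior to a server.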

Figure 5.5 gives an overview of the levels of abstraction of graphics tools.


Client Program
ABSTRACT IMAGE MACHINES (ex: Boites, PHIGS, ThingLab)
REAL IMAGE MACHINES LEVEL 1 (ex: GKS)
REAL IMAGE MACHINES LEVEL 0 (ex: QuickDraw, PostScript)

Figure 5.5: How graphics tools relate to each other with regard to their level of
abstraction.


5.3.2. Abstract Imaging and Structural Relationship

As described in Chapter 3, an abstract image is an intermediary data structure between
the structures maintained by the application program and the actual image on a rendition
surface. It shortens the distance between the representation convenient for the
application program and the representation required by windowing systems. Its
purpose is to express logical relationships maintained in the application data structures
as graphic relations. The goal is not to express all of the logical relationships but those
that help the user perform the execution and evaluation stages. One
important class of relations is the structural relationship. A number of tools based on
the notion of box, and the ISO graphics standard PHIGS, propose abstract imaging
around the notion of structure.

5.3.2.1. Box-Based Abstract Imaging

The notion of box was first used in TeX [Knuth 79], for output rendition only. Since
then, the notion of box has been extended by a number of tools [Mikelsons 81, Coutaz
85a, Coutaz 85b, Aihers 86, Quint 87] to consider inputs as well.

The box as described in [Coutaz 85a] and [Coutaz 85b] is a tree-like structure. A tree
facilitates the definition of inheritance and synthesis mechanisms for computing
attributes. Attributes decorate nodes to express spatial relations (such as alignment
and indentation), visual effects (such as highlighting and coloring), polymorphism (such
as elision), and links to application dependent data structures. Leaves contain
displayable application dependent information. They are containers: they have no
semantic knowledge about their content other than its type (e.g., image, text). As a
container, a leaf wraps an imaginary rectangle around the information. Nodes are
compound boxes. A compound box is the result of a formatting composition of
subtrees.

Figure  5.6  shows  one  possible  tree  of  boxes  that  corresponds  to  an  "if  statement" 
maintained  by  a  syntactic  editor. 


The formatting attribute HorV first tries to concatenate the subtrees horizontally. If the
resulting rectangle is too wide to fit the available width of the rendition surface, a
vertical composition is applied automatically. The attribute H concatenates the
subtrees horizontally. Hind specifies the value of the horizontal indentation if one has
to be performed.


If {Cond} Then {Stmt} Else {Stmt} endIf


Figure  5.7:  Layout  in  a  wide  enough  window. 


The interpretation of the tree will generate the actual images shown in Figures 5.7 and
5.8, depending on the effective width of the output window. Note that when the user
resizes the window, the new formatting is automatically handled by the abstract image
interpreter. The application is not bothered by syntactic user actions that are irrelevant
to its expertise.

If {Cond}
Then
{Stmt}
Else
{Stmt}
endIf

Figure 5.8: Layout in a too narrow window.
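
The behavior of the H, HorV and Hind attributes can be sketched with a small tree
interpreter. This is a simplified model, not the interpreter of [Coutaz 85a]: the Leaf
and Box classes, the shape of the tree, the measurement of width in characters and the
indentation of two spaces are all assumptions of this example.

```python
class Leaf:
    """A container for displayable text."""

    def __init__(self, text):
        self.text = text

    def flat(self):
        return self.text

    def layout(self, width):
        return [self.text]


class Box:
    """A compound box whose mode is 'H', 'V' or 'HorV'."""

    def __init__(self, mode, children, hind=0):
        self.mode, self.children, self.hind = mode, children, hind

    def flat(self):
        """Horizontal concatenation of all subtrees on a single line."""
        return " ".join(child.flat() for child in self.children)

    def layout(self, width):
        line = self.flat()
        if self.mode == "H" or (self.mode == "HorV" and len(line) <= width):
            return [line]
        lines = []  # vertical composition, indenting all children but the first
        for index, child in enumerate(self.children):
            indent = 0 if index == 0 else self.hind
            for sub_line in child.layout(width - indent):
                lines.append(" " * indent + sub_line)
        return lines


if_statement = Box("HorV", [
    Box("H", [Leaf("If"), Leaf("{Cond}")]),
    Box("HorV", [Leaf("Then"), Leaf("{Stmt}")], hind=2),
    Box("HorV", [Leaf("Else"), Leaf("{Stmt}")], hind=2),
    Leaf("endIf"),
])

print("\n".join(if_statement.layout(80)))
print("\n".join(if_statement.layout(8)))
```

The same tree yields a one-line layout when the width allows it and a stacked layout
otherwise; the application data is untouched by the change of width.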


5.3.2.2.  PHIGS 

PHIGS [ISO 86a] is a standard for graphics which takes GKS as a point of departure.
However, the static notion of segment has been replaced by the editable notion of
structure. Figure 5.9 shows an example of a structure definition. The interpretation of
the request POST_STRUCTURE(A) executes the definition of A. The definition of A
consists of the graphics elements included between the requests
OPEN_STRUCTURE(A) and CLOSE_STRUCTURE. The element
EXECUTE_STRUCTURE behaves just like a procedure call: it saves the current
context, switches to a new context and comes back to the calling context.
EXECUTE_STRUCTURE(B) saves the current graphics context of A, interprets the
definition of B and, once B has been made part of A, returns to the execution of A. For
inputs, PHIGS uses an extension of the GKS notion of logical units to take into account
the structural organization. In particular, a PICK returns a path which uniquely denotes
the selected element.
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In  contrast  to  GKS  segments,  PHIGS  structures  can  be  dynamically  modified.  The 
model  for  modification  is  Inspired  from  line  text  editors.  Figure  5.10  shows  an  example 
of  a  structure  edition.  As  for  text  editors,  you  first  need  to  open  the  recipient: 
OPEN_STRUCTURE(MYHOUSE)  opens  the  structure  MYHOUSE.  By  doing  so,  the 
interpreter  places  the  Insertion  point  at  the  end  of  the  structure  definition  and  sets  itself 
in  Input  mode.  This  means  that  subsequent  graphics  elements  will  be  automatically 
added  at  the  end  of  the  current  structure.  If  the  client  program  needs  to  delete  the 
window  element,  then  a  DELETE_ELEMENT(MYWINDOW)  will  do  the  job.  The  LABEL 
(MYWINDOW)  Is  a  symbolic  way  of  denoting  a  graphics  element,  just  like  a  line  number 
designates  text  lines  in  line  based  text  editors.  Similariy,  if  one  wants  to  replace  the 
definition  of  the  door,  then  the  insertion  point  can  be  set  at  the  appropriate  point  in  the 
structure  definition  and  the  replace  mode  will  substitute  old  graphics  elements  by  new 
ones. 
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initial  Definition  of  liYHOUSE  Editina  MYHOUSE 

OPEN-STRUCTURE  (MYHOUSE)  OPEN-STRUCTURE  (MYHOUSE) 

OELETE_ELEMENT(MYWINOOV/) 

LABEL  (MYWINDOW)  SET_ELEMENT_POINTER(MYDOOR) 

SET_EDiT_MODE  (REPLACE) 

LABEL  (MYDOOR) 

CLOSE-STRUCTURE  SET_EDIT_MODE(  INSERT) 

CLOSE-STRUCTURE 

Rgure  5.10:  A  PHiGS  structure  can  be  dynamically  edited. 
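
The two ideas of this section, execution of a nested structure as a procedure call and editing through an element pointer, can be tried out with a toy model. The Python below is a sketch only and is not a PHIGS language binding: the class StructureStore, the element kinds COLOR, DRAW and EXECUTE, and the structure WINDOWPANE are invented for the example, and a label is simplified to a name attached to one element, following the description in the text.

```python
class StructureStore:
    """Toy model of editable, nestable structures (not a real PHIGS binding)."""

    def __init__(self):
        self.structures = {}    # structure name -> list of [label, element]
        self.current = None     # element list of the structure being edited
        self.pointer = 0        # number of elements before the insertion point
        self.mode = "INSERT"

    def open_structure(self, name):
        self.current = self.structures.setdefault(name, [])
        self.pointer = len(self.current)    # insertion point at the end
        self.mode = "INSERT"

    def close_structure(self):
        self.current = None

    def set_edit_mode(self, mode):
        self.mode = mode

    def _index(self, label):
        return [lab for lab, _ in self.current].index(label)

    def set_element_pointer(self, label):
        self.pointer = self._index(label) + 1   # pointer sits on that element

    def delete_element(self, label):
        index = self._index(label)
        del self.current[index]
        self.pointer = index

    def add(self, element, label=None):
        if self.mode == "REPLACE" and self.pointer > 0:
            # Overwrite the element under the pointer, keeping its label.
            old_label = self.current[self.pointer - 1][0]
            self.current[self.pointer - 1] = [old_label, element]
        else:
            self.current.insert(self.pointer, [label, element])
            self.pointer += 1

    def post_structure(self, name, context=None):
        """Interpret a structure; returns the (shape, color) pairs drawn."""
        context = dict(context or {"color": "black"})   # private copy
        drawn = []
        for _, (kind, value) in self.structures[name]:
            if kind == "COLOR":
                context["color"] = value
            elif kind == "DRAW":
                drawn.append((value, context["color"]))
            elif kind == "EXECUTE":
                # Like a procedure call: the callee works on a copy of the
                # context, so the caller's context is intact on return.
                drawn += self.post_structure(value, context)
        return drawn


store = StructureStore()
store.open_structure("WINDOWPANE")
store.add(("COLOR", "blue"))
store.add(("DRAW", "glass"))
store.close_structure()

store.open_structure("MYHOUSE")
store.add(("COLOR", "red"))
store.add(("EXECUTE", "WINDOWPANE"), label="MYWINDOW")
store.add(("DRAW", "door"), label="MYDOOR")
store.close_structure()
before = store.post_structure("MYHOUSE")

# The editing sequence of Figure 5.10.
store.open_structure("MYHOUSE")
store.delete_element("MYWINDOW")
store.set_element_pointer("MYDOOR")
store.set_edit_mode("REPLACE")
store.add(("DRAW", "double door"))
store.set_edit_mode("INSERT")
store.close_structure()
after = store.post_structure("MYHOUSE")
```

In the first traversal the door is still drawn in red although the executed structure switched the color to blue, which is the context save and restore described above.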

5.3.3.  Constraint-Based  Imaging 

A constraint describes a relation which must always be satisfied. The set of relations
maintained in the abstract image machines presented in Paragraph 5.3.2 is limited in
scope. More general mechanisms for expressing any type of graphics constraint need
to be developed. ThingLab [Borning 86], although its goal is not abstract imaging, is an
interesting illustration of a graphics constraint solver.

ThingLab is an interactive environment built on top of Smalltalk-80. It allows a user to
specify constraints between graphics objects. In contrast to the box mechanism,
these constraints are not restricted to a predetermined set. A ThingLab constraint
comprises a predicate and one or several methods. The predicate is an algebraic
expression which is used for constraint checking. The methods modify the entities
referenced in the predicate in order to guarantee visual consistency. The power of
ThingLab is that these methods are automatically generated from the specification of
the predicate. Figure 5.11 shows an example.

Figure 5.11: Principles of constraint specifications in ThingLab.

The bottom window contains the object MyBar as it will appear at runtime. The upper
window gathers the usual Smalltalk browser menus which allow the user to define an
algebraic expression, identify the constants and the variables, and indicate which class is
reused to build the new object (here, the rectangle class is appropriate for constructing
MyBar). The middle window is the workshop. The goal is to define a vertical bar to
represent an integer n between 0 and 100. The algebraic expression
defines the height of the rectangle, where: h1 and h2 are respectively the top left and
bottom right corners of the rectangle; p1 and p2 are two constant points such that the
length of the segment [p1p2] determines the height of the rectangle when n is 100; h1y
and h2y denote the vertical coordinates of h1 and h2.
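
The division of a constraint into one predicate and several methods can be made concrete with the bar example. The sketch below is written in Python rather than Smalltalk and makes several assumptions: the algebraic expression of Figure 5.11 is not reproduced in the text, so the relation h2y - h1y = (n / 100) * length([p1p2]) is assumed here, the value of FULL_HEIGHT is arbitrary, and the two methods are written by hand whereas ThingLab would generate them from the predicate.

```python
from dataclasses import dataclass

FULL_HEIGHT = 200.0   # assumed length of the constant segment [p1 p2]


@dataclass
class Bar:
    n: int        # value represented, between 0 and 100
    h1y: float    # vertical coordinate of the top left corner
    h2y: float    # vertical coordinate of the bottom right corner


def predicate(bar):
    """Constraint check: the bar height must be n percent of FULL_HEIGHT."""
    return abs((bar.h2y - bar.h1y) - bar.n / 100 * FULL_HEIGHT) < 1e-9


def move_top(bar):
    """Method 1: restore the relation by moving the top edge."""
    bar.h1y = bar.h2y - bar.n / 100 * FULL_HEIGHT


def update_n(bar):
    """Method 2: restore the relation by recomputing n from the geometry."""
    bar.n = round((bar.h2y - bar.h1y) / FULL_HEIGHT * 100)


def satisfy(bar, method):
    """Run one method only if the predicate does not already hold."""
    if not predicate(bar):
        method(bar)
    return predicate(bar)


bar = Bar(n=0, h1y=300.0, h2y=300.0)
bar.n = 75                                      # the application changes the value
ok_after_value_change = satisfy(bar, move_top)
top_after_value_change = bar.h1y
bar.h1y = 200.0                                 # the user drags the top edge
ok_after_drag = satisfy(bar, update_n)
print(bar)
```

Which method is run depends on which entity was changed from outside: a change to n is repaired by moving the rectangle, and a change to the rectangle is repaired by updating n.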

ThingLab has served as a basis for the implementation of more specialized
environments: Animus [Duisberg 86], which introduces the notion of time, and the Filter
Browser [Ege 87] for the specification of user interfaces.


6. User Interface Management Systems (UIMS)


6.1.  User  Interface  Runtime  Kernels 


6.1.1.  Introduction 

Toolkits provide components with which it is possible to construct a user interface.
Each component is specific to a particular information presentation or acquisition task.
A complete interface, however, must contain multiple components which act together to
convey information to and from the functional portion of the interactive system.

A user interface runtime kernel is a skeleton or a packaging of the tools in a toolkit to
provide a collection and a sequencing mechanism for the tools and a communication
mechanism for information to and from the functional portion. The issues involved in
the runtime kernel are:

1. The software structure used in the runtime kernel, in particular, the
architectural model underlying the software and the interface between
the particular components of the architectural model.

2. Threads of control.

3. The model used to describe the interactions between the end user and
the functional portion. This is usually called the dialogue model.

4. The management of multiple views of the same application data
instance.

5. Feedback issues.

These  issues  are  discussed  in  the  sections  that  follow. 

6.1.2.  Software  Structure 

There is general agreement that a complete interactive application can be partitioned
into three components [Pfaff 85]. These three components are the functional core of the
application, the user interface runtime kernel and the lower level presentation layer.
Each component can be implemented using whatever tools are available. In Chapters
4 and 5, the presentation layer was discussed in terms of window systems and
toolboxes. In this section some of the structural issues associated with the runtime
kernel are discussed. In particular, two methods are presented: one for dealing with the
interfaces between the layers, based on the Serpent UIMS [Bass 88], and one for using
an object-oriented decomposition of the runtime structure, based on the PAC model
[Coutaz 87a, 87b].

Figure 6.1 gives a high-level view of the components of an interactive application. The
application component consists of the functional core and a communication portion with
the user interface. The objects in this communication portion are at the level of
abstraction of the application and have no presentation components. The user
interface has two portions: the presentation components and the dialogue controller.
The objects in the presentation component are presentation objects and have no
application knowledge within them. The behavior of these objects must convey
application semantics but the objects themselves have no application knowledge. The
dialogue controller performs the mapping between the application objects and the
presentation objects. Application domain knowledge can be embedded into the
dialogue controller to perform the mappings or can be restricted to the functional core.
These decisions depend upon the particular circumstances of the application.


Note that the mapping between the application objects and the presentation objects is
bidirectional. End user actions will both modify the application objects and provide
commands to the application core to perform its functions. Also, the mapping is not
necessarily one to one. Suppose the display shows a fluid boiling. The application
has one object which represents temperature and another which represents pressure.
Whether the fluid is boiling depends upon both. The dialogue controller must combine the two
application objects into a single presentation object. This is an example of a situation
where the dialogue controller has application domain knowledge.
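
A many-to-one mapping of this kind can be sketched as follows. The Python below is an illustration only: the class DialogueController, its method application_changed and the function toy_boiling_point are hypothetical names, and the boiling rule is a made-up linear formula used to drive the example, not a physical model.

```python
class DialogueController:
    """Maps two application objects onto one presentation object (sketch)."""

    def __init__(self, boiling_point_at):
        # boiling_point_at(pressure) -> temperature. This function is the
        # application domain knowledge embedded in the controller.
        self.boiling_point_at = boiling_point_at
        self.temperature = None
        self.pressure = None
        self.presentation = {"fluid": "unknown"}   # the single presentation object

    def application_changed(self, name, value):
        """Called when the functional core updates an application object."""
        if name not in ("temperature", "pressure"):
            raise KeyError(name)
        setattr(self, name, value)
        if self.temperature is not None and self.pressure is not None:
            boiling = self.temperature >= self.boiling_point_at(self.pressure)
            self.presentation["fluid"] = "boiling" if boiling else "still"


def toy_boiling_point(pressure):
    """Made-up linear rule used only to drive the example, not real physics."""
    return 100.0 + 0.25 * (pressure - 100.0)


controller = DialogueController(toy_boiling_point)
controller.application_changed("temperature", 105.0)
state_before_pressure_known = controller.presentation["fluid"]
controller.application_changed("pressure", 100.0)
state_low_pressure = controller.presentation["fluid"]
controller.application_changed("pressure", 140.0)
state_high_pressure = controller.presentation["fluid"]
```

If the boiling rule were moved into the functional core instead, the core would export a third object (boiling or not) and the controller would reduce to a one-to-one mapping; this is the design decision the text leaves to the circumstances of the application.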


6.1.3. Serpent Component Interface Management

Serpent is an example of such a runtime kernel and will be used to explain the
concepts in more detail. An application using Serpent has three explicit components:
the application functional core, the runtime kernel and the presentation
level. The presentation level is composed of an X toolkit component and other
components which use different technologies for input and output (e.g., video output and
gesture input). Serpent is designed to allow for easy integration of additional
interaction mechanisms and for explicit separation between the application functional
core and the runtime kernel. The integration of additional interaction mechanisms is
accomplished by having an explicit separation between the presentation layer and the
dialogue manager. The interface between the layers allows different presentation
layers to be used with only a modification of the dialogue manager and no modification
of the application.

The separation between the components is accomplished by providing an explicit
interface description. On one side of the interface is the Serpent runtime kernel. On the
other is either the functional core of the application or the presentation layer. Figure 6.2
displays this structure.

The application and the presentation layer view the Serpent runtime kernel as an active
data base manager. The application views Serpent as a manager of data in which the
end user might be interested, and the presentation layer views Serpent as the manager
of data which control its presentation and interactions. In either case, there is an
explicit specification of the data which is to go through the Serpent runtime kernel. This
specification takes the form of a schema which is similar in form to a schema for a
traditional data base system.

Whenever the application modifies a data item in the data base managed by Serpent,
the runtime kernel manages all of the implications of that change. When the
end user performs an action which affects a data item in the data base which Serpent
manages for the application, the application is informed of the change.

The schema which defines the form of the data to pass over the interface is processed
prior to Serpent runtime. The processor produces a C header file (or Ada package) for
the application to include. This guarantees that both sides of the interface have the
same data description and, consequently, helps ensure the integrity of the data which
crosses the interface.


[Figure labels: application; application portion (media independent); Ada; other I/O
technology; X window]

Figure 6.2: Serpent architecture.

The use of a schema to define the data that Serpent manages allows Serpent to be
reusable. Data of arbitrary complexity can be described in terms of the schema
description used in the interface and, consequently, additional interaction mechanisms
can be added and arbitrary applications can use Serpent.
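
The view of the runtime kernel as an active data base manager driven by a schema can be approximated in a few lines. The following Python is a sketch under stated assumptions and does not reproduce Serpent's schema language or its generated C or Ada declarations: the class ActiveDatabase, the party names and the schema elements pressure and valve_open are hypothetical, and the dialogue layer that sits between the two sides in Serpent is left out.

```python
class ActiveDatabase:
    """Schema-checked shared data with change notification (illustrative)."""

    def __init__(self, schema):
        self.schema = schema      # element name -> required Python type
        self.values = {}
        self.listeners = {}       # party name -> callback(element, value)

    def attach(self, party, callback):
        self.listeners[party] = callback

    def put(self, party, element, value):
        if element not in self.schema:
            raise KeyError(f"{element!r} is not declared in the schema")
        if not isinstance(value, self.schema[element]):
            raise TypeError(f"{element!r} must be {self.schema[element].__name__}")
        self.values[element] = value
        # Inform every party except the one that made the change.
        for other, callback in self.listeners.items():
            if other != party:
                callback(element, value)


schema = {"pressure": float, "valve_open": bool}
db = ActiveDatabase(schema)
seen_by_presentation, seen_by_application = [], []
db.attach("presentation", lambda e, v: seen_by_presentation.append((e, v)))
db.attach("application", lambda e, v: seen_by_application.append((e, v)))

db.put("application", "pressure", 2.5)        # the core updates a value
db.put("presentation", "valve_open", True)    # the end user flips a control
try:
    db.put("application", "pressure", "high")  # violates the schema
    rejected = False
except TypeError:
    rejected = True
```

Because both sides are checked against the same schema object, a value of the wrong form is refused before it crosses the interface, which is the integrity argument made for the generated header file.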


6.1.4.  Threads  of  Control 

One motivation for the Rooms system (Section 4.7) is the end user's desire to move
from one task to another, whether or not the current task is completed. The dialogue
controller must be able to maintain the context of the interrupted task and restore it
when that task is to be resumed. This is one example of having multiple threads of
control within a dialogue. Another example is the simultaneous use of multiple input
and output devices. Some types of interaction require two-handed input utilizing
different devices [Buxton 86a]. If the devices are not integrated at the presentation level,
then the dialogue manager must simultaneously process the input from both devices,
coordinate it and determine the mapping into desired application actions. In the
Macintosh toolkit, for example, this type of activity must be performed in the top level
controller and cannot be pushed into the presentation level.

In either case, the requirements imposed on the dialogue manager by both end
user task switching and multiple simultaneous devices mean that the dialogue
manager must support parallelism.


6.1.5.  The  Model  Used  to  Describe  User  Interactions 

A number of different models have been used to describe (and hence to specify)
user interactions. These models are:

1. Formal grammar models, in particular BNF

2. Finite state machines, usually augmented

3. Production or event models

4. Object-oriented models

Any implementation of these models has two portions. The first is a language for
describing interactions in terms of the model. A program in this language becomes a
specification of the behavior of the runtime kernel. The second portion is
the runtime interpretation of the specification. An implementation decision is whether
the specification language is compiled into a lower level description or is directly
interpreted.

6.1.5.1. Formal Grammar Models

An early system, SYNGRAPH [Olsen 83], used BNF to specify the user interactions.
Each nonterminal in the BNF had an associated action routine which described the
presentation and the actions associated with the presentation. A legal interaction is
one which can be parsed through the BNF. BNF, by its nature, defines an explicit legal
ordering of events. This imposes a particular style upon the interfaces
specified using BNF. For example, suppose different parameter orderings are allowed.
Different BNF rules must be used to specify each ordering. Therefore, in order to allow
the end user to choose an ordering at runtime, multiple sets of BNF rules must be
specified.

Furthermore, since all actions in BNF must be explicitly stated, specifying complete
interactions that allow a user to change the current task in the middle (which, as
described in Section 6.3, is a desirable feature) is a formidable chore.
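
The cost of allowing free parameter order can be made visible by generating the rules mechanically. The Python sketch below is not taken from SYNGRAPH; the command name move, its three parameters and the function ordering_rules are hypothetical, and the rule text is only BNF-like notation.

```python
from itertools import permutations


def ordering_rules(command, parameters):
    """Return one BNF-style alternative per permitted parameter ordering."""
    rules = []
    for order in permutations(parameters):
        body = " ".join(f"<{name}>" for name in order)
        rules.append(f"<{command}> ::= {command} {body}")
    return rules


rules = ordering_rules("move", ["object", "destination", "speed"])
for rule in rules:
    print(rule)
```

A command with k freely ordered parameters needs k! alternatives, so three parameters already require six rules and four would require twenty-four.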

6.1.5.2. Transition Networks

An alternative to BNF as a specification model is to use a finite state machine. The finite
state machine is typically augmented to allow a richer description mechanism than finite
state automata. USE [Wasserman 85] is an example of such a system. Finite state
machines suffer from the same sequencing problems as BNF. An additional problem
that both specification techniques suffer from is the lack of model support for levels of
abstraction.

The specification of a selection of an object (cursor over object, button click) is one level
of abstraction; the specification of the ordering of parameters to a command is a higher
level. A transition network does not distinguish between these levels of abstraction
and, consequently, a specification using a transition network becomes difficult to code
and decipher.

Some extensions to transition networks allow the nesting of transitions in an attempt to
support the different levels of abstraction [Kieras 85, Harel 87].
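
A small augmented transition network shows both difficulties at once. The Python below is a sketch and does not follow the notation of USE; the state names, event names and action names are invented for the example.

```python
# (state, event) -> (next state, action name). The action names are the
# "augmentation" that lets the network call out to other code.
TRANSITIONS = {
    ("idle", "cursor_over_object"): ("hovering", None),
    ("hovering", "cursor_leaves"): ("idle", None),
    ("hovering", "button_click"): ("selected", "highlight"),
    ("selected", "choose_delete"): ("idle", "delete_object"),
}


def run(events, state="idle"):
    """Feed events to the network; returns (final state, actions, rejected)."""
    actions, rejected = [], []
    for event in events:
        if (state, event) not in TRANSITIONS:
            rejected.append((state, event))   # not a legal continuation
            continue
        state, action = TRANSITIONS[(state, event)]
        if action is not None:
            actions.append(action)
    return state, actions, rejected


legal = run(["cursor_over_object", "button_click", "choose_delete"])
out_of_order = run(["choose_delete", "cursor_over_object"])
```

The table mixes lexical events (cursor movement, button click) with a command-level event (choose_delete) in one flat set of states, and a command issued before the selection is simply rejected; both points correspond to the criticisms made above.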

6.1.5.3. Production Model

Production models are collections of rules of the form if "firing rule" then "action".
Productions are data driven in the sense that the rules are fired when the firing rules are
satisfied and no particular sequencing constraints are placed on the firing rules.
Production rules [Garrett 82, Hill 87a, Hill 87b, Brownston 85] have been used recently
to attempt to specify the parallelism that end users seem to require. CLG [Moran
81] also uses the concepts of production models to describe the interaction
level, although this is not explicitly discussed.
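
The data driven character of productions can be shown with a very small rule interpreter. The Python below is a sketch and is unrelated to OPS83 or to any of the cited systems; the function run_productions, the rule names and the fact names are assumptions, and each rule is allowed to fire at most once per call to keep the loop finite.

```python
def run_productions(rules, facts):
    """Fire rules whose conditions hold until nothing changes.

    rules: list of (name, condition, action); condition(facts) -> bool and
           action(facts) mutates the fact dictionary.
    Returns the names of the rules in the order in which they fired.
    """
    fired = []
    progress = True
    while progress:
        progress = False
        for name, condition, action in rules:
            if name not in fired and condition(facts):
                action(facts)
                fired.append(name)
                progress = True
    return fired


rules = [
    ("run-command",
     lambda f: "object" in f and "command" in f,
     lambda f: f.update(result=f"{f['command']} {f['object']}")),
    ("show-prompt",
     lambda f: "object" in f and "command" not in f,
     lambda f: f.update(prompt="choose a command")),
]

# The same rule set accepts the two inputs in either order.
object_first = {"object": "circle"}
fired_1 = run_productions(rules, object_first)
object_first["command"] = "delete"
fired_2 = run_productions(rules, object_first)

command_first = {"command": "delete"}
fired_3 = run_productions(rules, command_first)
command_first["object"] = "circle"
fired_4 = run_productions(rules, command_first)
```

No ordering of the two inputs is written down anywhere: the command runs as soon as both facts are present, whichever arrived first.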

The Serpent model for dialogue uses "view controllers" to specify the mapping between
the application objects and the presentation objects. Each view controller has a
creation condition which corresponds to the firing rule. The creation condition is a
condition on the application objects or on local objects. Local objects are maintained
for dialogue control purposes only and are not visible to either the application or the
presentation. Each view controller controls a collection of presentation objects. The
methods of these presentation objects perform the reverse mapping from the
presentation layer to the application. View controllers can be nested and the lower
levels inherit the application objects which created the parent levels. The use of
production rules solves the explicit ordering problems associated with transition
networks and BNF grammars. On the other hand, production rules by themselves still
provide no model support for levels of abstraction. That support comes from the structural
ideas of PAC or from the nested objects used in the production model of Serpent.

Systems based on production rules suffer from several problems. Since control is not
explicitly transferred within the specification of the dialogue, the system must monitor a
large data space in order to decide which rules to fire. This monitoring of a large data
space may lead to performance problems. The appearance of more efficient production
systems [Forgy 84] has reduced the magnitude of this problem. Preliminary indications
are that performance within Serpent (which uses OPS83) is driven by the performance
of the presentation layer and not by the production manager.

A second problem associated with the use of a production rule model is, precisely, the
lack of explicit transfer of control. Programmers are taught to think of algorithms
sequentially and the data driven nature of production models requires a heavily parallel
method of thinking. This is a problem that can be overcome with training and, if
production rule systems prove to be suitably useful, programmers will be taught
earlier to think in terms of parallel solutions to problems.

6.1.5.4. Object-Oriented Model

A different approach to the specification of the mapping from application objects to
presentation objects is to use an object-oriented approach. This approach underlies
the PAC model [Coutaz 87b].

In the PAC model, an interactive application is comprised of three parts: Presentation,
Abstraction and Control.

The Presentation defines the concrete syntax of the application, i.e., the input and
output behaviour of the application as perceived by the user. The Abstraction part
corresponds to the semantics of the application. It implements the functions that the
application is able to perform. The Control part maintains the mapping and the
consistency between the abstract entities involved in the interaction and implemented
in the Abstract part, and their presentation to the user. It embodies the boundary
between semantics and syntax.

For example, the application "Clock" implements and involves two abstract entities in
the dialogue: the data structure "Time" and the function "SetTime". "Time" may be
presented as a digital or a dial clock. "SetTime" may be explicitly presented as a button
or implicitly presented through the direct manipulation of the needles of the dial clock.
The job of the Control part is to invoke SetTime on specific user actions and to provoke
the update of the dial clock when the application (i.e., the Abstract part) makes a request.

The Presentation of an application is implemented with a set of entities, called
interactive objects, specialized for man-machine communication. As with applications,
an interactive object is organized according to the PAC model. Consider, for example,
the pie chart shown in Figure 6.3.

1. The Presentation is comprised of:

• for output: a circular shape and a color for each piece of the
pie.

• for input: the mouse actions that the user can perform to
interactively change the relative size of the pieces.

2. The Abstraction is comprised of an integer value within the range of two
integer limits.

3. The Control maintains the consistency between the Presentation and
the Abstraction. For example, if the user modifies the size of one piece,
Control provokes the update of the integer value. Conversely, if the
application or another interactive object modifies the value of the
integer, the size of the pieces is automatically adjusted.


[Figure labels: Abstraction (Min = 0, Max = 400, Value = 50); Control; Presentation]

Figure 6.3: An elementary PAC interactive object.
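
The three parts of the pie chart can be written down as three small classes. The Python below is a sketch, not code from any PAC implementation: the class and method names are assumptions, the Presentation is reduced to the angle of one filled piece instead of real drawing, and the limits and initial value are the ones that appear as labels in Figure 6.3.

```python
class Abstraction:
    """An integer value kept within two integer limits."""

    def __init__(self, minimum, maximum, value):
        self.minimum, self.maximum = minimum, maximum
        self.value = value

    def set(self, value):
        self.value = max(self.minimum, min(self.maximum, value))


class Presentation:
    """Stands in for the drawn pie: remembers the angle of the filled piece."""

    def __init__(self):
        self.angle = 0.0

    def draw(self, fraction):
        self.angle = 360.0 * fraction


class Control:
    """Keeps the Abstraction and the Presentation consistent."""

    def __init__(self, abstraction, presentation):
        self.abstraction, self.presentation = abstraction, presentation
        self._refresh()

    def _refresh(self):
        a = self.abstraction
        self.presentation.draw((a.value - a.minimum) / (a.maximum - a.minimum))

    def user_dragged_to(self, angle):
        """Presentation to Abstraction: the user resized a piece."""
        a = self.abstraction
        a.set(round(a.minimum + angle / 360.0 * (a.maximum - a.minimum)))
        self._refresh()

    def value_changed(self, value):
        """Abstraction to Presentation: the application changed the value."""
        self.abstraction.set(value)
        self._refresh()


pie = Control(Abstraction(0, 400, 50), Presentation())
initial_angle = pie.presentation.angle
pie.user_dragged_to(90.0)
value_after_drag = pie.abstraction.value
pie.value_changed(500)    # out of range, clamped to the upper limit
```

Neither the Abstraction nor the Presentation refers to the other; every update passes through Control, which is what allows one side to be replaced without touching the other.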

Compound objects can be built from elementary interactive objects. They also adhere
to the PAC model. Consider, for example, the super pie chart shown in Figure 6.4. It
is made from two elementary objects: the pie chart described above and a numerical
string which shows the current abstract value of the pie chart. If Control C receives a
message notifying it of the modification of the abstract value, it notifies both C1 and
C2 of the alteration. Conversely, if the user changes the size of a piece of the pie with
the mouse, C1 reflects the modification to C which, in turn, notifies C2.

Figure 6.4: A compound PAC interactive object.


In summary, by applying PAC recursively at every level of abstraction of the user
interface, everything in an interactive application is a PAC object, from the elementary
interactive object to the whole application. As shown in the upper rectangle of Figure
6.5, the whole interactive application is a PAC entity. The Abstraction part of the
application involves three domain dependent concepts in the dialogue. The Controller
at the top of the hierarchy bridges the gap between the Abstraction and the
Presentation. The Presentation is made of 4 interactive objects. The second, lower
rectangle shows the PAC structure of the compound interactive object represented as a
black circle. This object is built from two elementary PAC objects and one compound
object which, in turn, is composed of two elementary PAC objects.

CMU/SEI-89-TR-4

In addition, the user interface of a workstation (generally referred to as a shell) may be
modelled in a straightforward manner by adding an extra PAC layer on top of the
application level. The Abstract part of that layer may include such global data structures
as the "clipping board" or the "network status." The Presentation would present these
data structures and allow for the initial invocation of applications. Finally, the Control
part would, of course, bridge the gap between the abstract and the concrete sides. It
would also supervise the control parts of all of the active applications. Such an
arbitrator should provide the basis for a uniform mechanism for transferring data
between applications.

Figure 6.5: The design model.


This  recursive  object-oriented  organization  presents  some  advantages  which  are 
described  in  the  following  paragraph. 
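
The message flow inside a compound object such as the super pie chart of Figure 6.4 can be sketched with a single recursive class. The Python below is an illustration under stated assumptions: only the Control parts are modelled, the class Controller and its method names are hypothetical, and a shared log list is added purely so that the order of notifications can be inspected.

```python
class Controller:
    """Control part of a PAC object; children are nested PAC controls."""

    def __init__(self, name, parent=None):
        self.name, self.parent, self.children = name, parent, []
        self.value = None
        self.log = []                  # trace of notifications
        if parent is not None:
            parent.children.append(self)
            self.log = parent.log      # the whole hierarchy shares one trace

    def set_from_above(self, value):
        """The parent (or the application) pushes a new abstract value."""
        self.value = value
        self.log.append(f"{self.name} updated to {value}")
        for child in self.children:
            child.set_from_above(value)

    def user_changed(self, value):
        """The user manipulated this object's presentation."""
        self.value = value
        self.log.append(f"{self.name} changed by user to {value}")
        if self.parent is not None:
            self.parent.child_reported(self, value)

    def child_reported(self, source, value):
        """A child reflects a modification; forward it to the other children."""
        self.value = value
        self.log.append(f"{self.name} informed by {source.name}")
        for child in self.children:
            if child is not source:
                child.set_from_above(value)
        if self.parent is not None:
            self.parent.child_reported(self, value)


c = Controller("C")
c1 = Controller("C1", parent=c)    # the pie chart
c2 = Controller("C2", parent=c)    # the numerical string
c1.user_changed(120)               # the user resizes a piece of the pie
c.set_from_above(30)               # the abstract value is modified
```

Because a Controller may be both a child and a parent, the same class serves for the elementary objects, the compound object and, one level higher, the whole application.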

6.1.5.5. The Interesting Aspects of the PAC Model

The PAC model has three interesting aspects:

1. It defines a consistent framework for the construction of user interfaces
that is applicable at any level of abstraction. As a direct consequence,
the units of exchange between the application (i.e., the Abstract part)
and the UIMS (i.e., the PAC controller) are application concepts, not
low-level details semantically irrelevant to the application.

2. It cleanly distinguishes functional notions from presentation policies and
introduces the control part to bridge the gap between the abstract and
the concrete worlds. The role of the control part may be extended from
consistency maintenance between the two worlds to the management
of local contextual information that may be useful for help, error
explanation and automatic adaptation to the user.

3. It takes full advantage of the object-oriented paradigm with the notion of
interactive object.

An interactive object is an active entity. It evolves, communicates and maintains
relationships with other objects. Such activity, parallelism and communication are
automatically performed by the Object Machine, the generic class of the interactive
objects. The Object Machine defines the general functioning that is made common to
all of the interactive objects by means of the inheritance mechanism. In particular, each
object owns a private finite state automaton for maintaining its current dialogue state.
On receipt of a message, an object is thus able to determine which actions to undertake
according to its current state. In particular, the PAC controller at the top of the
hierarchy of controllers maintains the global state of the dialogue with the application.
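
The idea of a generic class that gives every interactive object its own automaton can be sketched as follows. The Python below borrows only the name Object Machine from the text; the class layout, the transition table format and the StringField example are assumptions and do not describe the actual implementation.

```python
class ObjectMachine:
    """Generic class: every interactive object inherits a private automaton."""

    transitions = {}          # (state, message) -> (next state, method name)
    initial_state = "idle"

    def __init__(self):
        self.state = self.initial_state

    def receive(self, message, *args):
        """Choose the action according to the current dialogue state."""
        key = (self.state, message)
        if key not in self.transitions:
            return False                  # message ignored in this state
        self.state, method = self.transitions[key]
        getattr(self, method)(*args)
        return True


class StringField(ObjectMachine):
    """An interactive object of type string with its own dialogue state."""

    transitions = {
        ("idle", "click"): ("editing", "begin"),
        ("editing", "key"): ("editing", "append"),
        ("editing", "return"): ("idle", "commit"),
    }

    def __init__(self):
        super().__init__()
        self.buffer, self.text = "", ""

    def begin(self):
        self.buffer = ""

    def append(self, char):
        self.buffer += char

    def commit(self):
        self.text = self.buffer


a, b = StringField(), StringField()
a.receive("click")
a.receive("key", "4")
b.receive("click")            # the user switches to another object
b.receive("key", "x")
a.receive("key", "2")         # and comes back to the first one
a.receive("return")
ignored = not a.receive("key", "9")   # a is no longer in the editing state
```

Each field keeps its own state, so interleaving input between the two objects needs no central bookkeeping; the global state of the dialogue is simply the set of the individual states.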

Interactive objects implement the dialogue in a distributed way. This feature can serve
as a basis for the implementation of facilities related to the notion of context. It also
provides the necessary grounds for concurrent multiple I/O in the following way. The
set of automata (one automaton per interactive object) defines the global state of the
interaction between the user and the application. The control of the interaction is
therefore distributed in an evolving network of interactive objects. Dialogue control is
not handled by a single monolithic dialogue manager, which would be difficult to
maintain, extend and implement, in particular when one wants a pure user-driven style
of interaction. Instead, since interactive objects are able to maintain their own state, it is
easy to let the user switch between objects in any order. Thus, an object-oriented
approach provides for free the maintenance of the user's arbitrary manipulations.

Interactive objects are easily customizable. Object-oriented programming languages
support data abstraction, which makes it possible to change underlying
implementations without changing the calling programs. In the present case, this
principle allows the internal modification of an interactive object without changing its
presentation and abstract interfaces. Interestingly, it also allows the modification of one
interface without any side-effect on the other interface. For example, one can modify
the presentation of an interactive object (such as attaching a different key translation
table to an interactive object of type string) without affecting its abstract behaviour.
This property makes possible fine grained dynamic adjustments of the user interface
without massive modifications to the presentation of the whole application.


6.1.6. Multiple Views of Data

One problem associated with the separation of the user interface from the functional
core of the application is the management of multiple presentations of the same
application data item. Since the application is written to be media independent, it has
no knowledge of any presentation issues, in particular, how many times a particular
piece of its data is presented to the user and in what forms.

For example, suppose the pressure within a pipe is represented both by the color of the
fluid in the pipe and by a separate pressure gauge. When the pressure changes, both
presentations should change. Managing these multiple views of the same data item is
the responsibility of the runtime kernel. The kernel must have a mechanism to
determine which data items determine the nature of a particular presentation.
Otherwise, the kernel cannot automatically manage the presentation. This mechanism
must allow the determination that two different presentations depend upon the same
data item.

Whether two different presentations can be determined to depend upon the same data
item depends upon the interface between the functional core and the runtime kernel and
upon the information presented to the runtime kernel. In Serpent, for example, two
presentations are determined to depend upon the same data item if they both depend
upon a particular element in the data base schema which describes the data. This allows
the automatic modification of an aggregate in the presentation when a component
changes if the runtime interface is in charge of maintaining the aggregate. It does not
allow the automatic modification of the aggregate if the application is in charge of
maintaining the aggregate.
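
The dependency mechanism described above can be illustrated with a minimal sketch. It
is not the Serpent mechanism: the class RuntimeKernel, its methods bind, share_item and
put, the item name "pipe.pressure" and the 100 kPa color threshold are all hypothetical
names and values chosen for illustration. The sketch only assumes that the kernel keeps,
for each data item, the list of presentations that depend upon it.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# A presentation is modelled as a callback that redraws itself from a value.
View = Callable[[float], None]


class RuntimeKernel:
    """Hypothetical kernel that records which views depend on which data item."""

    def __init__(self) -> None:
        self._views: Dict[str, List[View]] = defaultdict(list)

    def bind(self, item: str, view: View) -> None:
        """Record that `view` depends on the data item named `item`."""
        self._views[item].append(view)

    def share_item(self, a: View, b: View) -> bool:
        """Return True if both views depend on at least one common data item."""
        return any(a in views and b in views for views in self._views.values())

    def put(self, item: str, value: float) -> None:
        """Called on behalf of the application: update every dependent view."""
        for view in self._views.get(item, []):
            view(value)


# Stand-in for the screen: each view writes its rendering here.
display: Dict[str, str] = {}


def fluid_color(pressure: float) -> None:
    # The 100 kPa threshold is an assumed value for the example.
    display["pipe_color"] = "red" if pressure > 100 else "blue"


def gauge(pressure: float) -> None:
    display["gauge"] = f"{pressure} kPa"


kernel = RuntimeKernel()
kernel.bind("pipe.pressure", fluid_color)
kernel.bind("pipe.pressure", gauge)
kernel.put("pipe.pressure", 120)  # both presentations change together
```

The application only calls put and never learns how many presentations exist, which is
the media independence the text asks for.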


6.1.7.  Feedback 

One of the most troublesome issues associated with the separation of the functional
portion of the application from the user interface is that of feedback [Hudson 88].

Feedback is the display to the user of some indication of the system's understanding of
the actions being performed. For example, in the X toolkit, a widget will reverse video
when the cursor is within the widget. It is also possible to change the cursor shape when
the cursor goes from one window to another. These are examples of lexical feedback and
are handled at the presentation level.

Another type of feedback comes from the runtime kernel. On the Macintosh, certain
options within a menu are displayed in gray to indicate that they are not currently
available. The runtime kernel knows the current context of the action and makes the
decision to display certain items in a fashion that gives feedback to the end user about
the current state of that item. This is an example of syntactic feedback (based on the
current context).

A deeper level of feedback might be changing the color of a beam in a CAD/CAM
application to represent the stress currently being placed on that beam. This is an
example of semantic feedback since the determination of the current color depends
upon knowledge that only the functional core of the application maintains.

These three types of feedback represent different levels of abstraction and should be
performed in separate portions of the software. This implies that the software structure
must be available to allow that separation. The hierarchical decomposition of PAC is
explicitly designed to allow the separation of various levels of feedback.
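
A minimal sketch of this separation follows, with one function per level. It is not the
PAC decomposition itself. All function names, the dictionary representation of widgets,
menu items and beams, and the STRESS_LIMIT value are assumptions made for illustration.
The point is only that each level of feedback needs knowledge held by a different
portion of the software.

```python
from typing import Dict, List, Set

# --- Presentation level: lexical feedback needs only the cursor position. ---
def lexical_feedback(widget: Dict[str, object], cursor_inside: bool) -> None:
    widget["reverse_video"] = cursor_inside


# --- Runtime kernel: syntactic feedback needs the current dialogue context. ---
def syntactic_feedback(menu: List[Dict[str, object]], available: Set[str]) -> None:
    for item in menu:
        item["grayed"] = item["name"] not in available


# --- Functional core: semantic feedback needs application knowledge. ---
STRESS_LIMIT = 250.0  # assumed limit, in arbitrary units


def beam_stress(load: float, area: float) -> float:
    """Application computation that only the functional core can perform."""
    return load / area


def semantic_feedback(beam: Dict[str, object]) -> None:
    stress = beam_stress(float(beam["load"]), float(beam["area"]))
    beam["color"] = "red" if stress > STRESS_LIMIT else "green"
```

Only semantic_feedback has to cross the boundary to the functional core, which is why it
is the level most exposed to the performance concerns discussed next.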

Feedback is a troubling issue because of its performance implications. Feedback, by its
nature, should be fast. The end user should, ideally, be given indications of the meaning
of an action while that action is occurring. It is not clear that this is always possible
in the case of deep semantic feedback, and the architectural structure of a system may
not always support both the performance requirements of rapid feedback and the
separation of the functional core of the application from the user interface. In any
case, the human processing model gives a bound on the required response time. Since
events occurring in less than 0.1 second are seen to be instantaneous, feedback
performance requirements will be satisfied if they can be met within that time period.
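
The 0.1 second bound can be checked mechanically. The sketch below is an illustration
only: the helper timed_feedback and the constant FEEDBACK_BUDGET_S are hypothetical
names, and the wrapped computation stands in for whatever the functional core must do to
produce semantic feedback.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")

# Bound taken from the text: events under 0.1 second appear instantaneous.
FEEDBACK_BUDGET_S = 0.1


def timed_feedback(compute: Callable[..., T], *args: object) -> Tuple[T, float, bool]:
    """Run a feedback computation and report whether it met the budget.

    Returns (result, elapsed_seconds, within_budget).
    """
    start = time.perf_counter()
    result = compute(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed, elapsed <= FEEDBACK_BUDGET_S
```

A design that keeps the separation of functional core and user interface could use such
a measurement to decide when a cheaper lexical or syntactic indication should be shown
while the semantic result is still being computed.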


6.2.  User  Interface  Environments 


6.2.1.  Introduction 

The actions of the runtime kernel are determined by a language used to describe the
dialogue. The mechanism for specification of that language plays a large part in the
acceptability of the user interface runtime system. One possibility, which will not be
discussed further, is to use a standard programming language to interact directly with
the runtime kernel. MacApp [Schmucker 86], APEX [Coutaz 87a] and EZWin
[Liebermann 85] are examples. The approach is to treat the runtime kernel as an
extension of a toolkit.

More interesting are cases where specialized languages or specification mechanisms
exist. The examples to be discussed are:

1.  Textual  language  specification 

2.  Graphical  editor  specification 

3.  Complete  environments 

Figure  6.6  represents  the  usage  of  the  specification.  The  dialogue  specifier  creates  a 
dialogue  using  some  tool  and  the  created  specification  provides  the  mechanism  for  the 
runtime  kernel  to  operate.  The  specification  can  be  distinct  in  time  from  the  execution  of 
the  runtime  kernel  or  specification  time  and  runtime  can  be  intertwined.  The  textual 
language  specification  which  is  discussed  first  is,  inherently,  distinct  in  time  from 
runtime. 


6.2.2. Textual Language Specification

Domain [Schulert 85] is a commercial user interface management system available
from Apollo. The model that Domain uses is given in Figure 6.7. This is also the model
used in Cousin [Hayes 83]. The interface between the domain dependent portion of the
program and the user interface is defined to be a group of "tasks". Each task has a
computation portion. The user interface is defined in terms of building blocks which
define the presentation in terms of the tasks. The application places values in the tasks
which cause the presentation to change, and the building blocks place values in the
tasks which affect the application. Figure 6.8 shows the user interface for a simple
example. Figure 6.9 gives the tasks, Figure 6.10 gives the building blocks and Figure
6.11 gives the application code for this example.


This program determines if an integer is
even or odd. Position the cursor with
the mouse (left button). Then type a
number between 0 and 20, and <RETURN>.


Figure 6.8: The user interface of a simple example.


example

exit-task:=NULL:
    COMP => <CALL odd-or-even>
    MIN=0;
    MAX=20
END

true-false-task:=BOOL:
    COMP =>o
END

message-task:=MSG:
    VALUE =
        "This program determines if an integer is even or odd."
        &"Position the cursor with the mouse (left button)."
        &"Then type a number between 0 and 20, and <RETURN>."
END


Figure 6.9: The tasks for the example of Figure 6.8.


USER-INTERFACE example

exit:=ICON:
    TASK = exit-task;
    BACKGROUND = GREY;
    SHAPE = ROUNDED;
    SIZE = (100 350) PIXELS;
    STRING = "exit"
END

number:=INT_FIELD:
    TASK = number-task;
    BACKGROUND = OFF
    SHAPE = ROUNDED
    HELP-TEXT = "you must give an integer from 0 to 20"
END

true-false:=BOOL-FIELD
    TASK = true-false-task;
    BACKGROUND = OFF;
    SHAPE = ROUNDED;
    HELP-TEXT = "true=even number" & "false = odd number"
END

row-bottom:=ROW
    BACKGROUND = ON;
    ORIENTATION = HORIZONTAL;
    BORDER-WIDTH = 10; DIVISION-WIDTH = 5;
    OUTLINE = ON; SHAPE = ROUNDED;
    CONTENTS = (exit number true-false)
END

message:= DISPLAY TEXT
    TASK = message-task;
    SHAPE = ROUNDED
END

row-all:= ROW
    BACKGROUND = ON;
    ORIENTATION = VERTICAL;
    BORDER-WIDTH = 10; DIVISION-WIDTH = 5;
    OUTLINE = ON; SHAPE = ROUNDED;
    CONTENTS = (row-bottom message)
END

std-window:
    CONTENTS = row-all
END


Figure 6.10: The building blocks for the user interface for the
example in Figure 6.8.


MAIN PROGRAM

    - initiate DIALOG
    - set initial values and defaults to tasks
    dp-$bool-set-value (true-false-task, true, status);
    - activate a task or a group of tasks
    dp-$task-activate (dp-$all-task-group, ...);
    - wait for an input event
    dp-$event-wait (...);
    - exit Dialogue
    dp-$terminate (...);

A MODULE: the procedure which checks the parity

odd-or-even()
int value-int, value-bool;
begin
    - get input data
    dp-$int-get-value (number-task, value-int, status);
    - check parity: the number is even when the remainder of division by 2 is 0
    if ((value-int mod 2) = 0) then
        value-bool = true
    else
        value-bool = false;
    - send the result to the task
    dp-$bool-set-value (true-false-task, value-bool, status);
end


Figure 6.11: The application code for the example of Figure 6.8.
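
The flow of values through the tasks in this example can be restated as a short runnable
sketch. It does not reproduce the Domain interface: the class Task, its methods
set_from_block and set_from_application, and the list shown are hypothetical names
introduced for illustration. The sketch assumes only what the text states, namely that
building blocks place values in tasks, that a task may trigger a computation, and that
values placed in a task by the application change the presentation.

```python
from typing import Callable, List, Optional


class Task:
    """Hypothetical task: a value shared by the application and the interface."""

    def __init__(self, name: str, value: object = None,
                 comp: Optional[Callable[[], None]] = None,
                 minimum: Optional[int] = None,
                 maximum: Optional[int] = None) -> None:
        self.name = name
        self.value = value
        self.comp = comp            # computation portion of the task
        self.minimum = minimum
        self.maximum = maximum
        self.blocks: List[Callable[[object], None]] = []  # dependent building blocks

    def set_from_block(self, value: int) -> None:
        """A building block (user input) places a value in the task."""
        if self.minimum is not None and value < self.minimum:
            raise ValueError(f"{self.name}: {value} is below {self.minimum}")
        if self.maximum is not None and value > self.maximum:
            raise ValueError(f"{self.name}: {value} is above {self.maximum}")
        self.value = value
        if self.comp is not None:
            self.comp()             # invoke the application computation

    def set_from_application(self, value: object) -> None:
        """The application places a value in the task; the presentation changes."""
        self.value = value
        for block in self.blocks:
            block(value)


def odd_or_even() -> None:
    """Application code: read the number task and write the boolean task."""
    true_false_task.set_from_application(number_task.value % 2 == 0)


number_task = Task("number-task", comp=odd_or_even, minimum=0, maximum=20)
true_false_task = Task("true-false-task", value=True)

shown: List[str] = []               # stand-in for the boolean field on screen
true_false_task.blocks.append(lambda v: shown.append(str(v)))

number_task.set_from_block(7)       # the user types 7 and <RETURN>
```

The application code never refers to a field, icon or window; it sees only the two tasks,
which is the separation the task model is meant to provide.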


6.2.3.  Graphical  Editor  Specification 

Since so much of the user interface is graphical in nature, it makes sense to have
editors which are used to specify the graphical portion of the interface. Such editors
have been created, for example Menulay [Buxton 83]. The editors become layout editors.
That is, the graphical editors are used to specify the appearance of a display and where
on the display various presentation objects will reside. Once the layout has been
specified, the connections between the presentation objects and the dialogue
control are established. One problem with the usage of such editors is how to
represent the dependencies upon application data. This issue goes to the heart of the
timing distinction between specification time and runtime.

6.2.3.1. Realization

The dialogue gives a mapping between application objects and presentation objects.
Implicit in this mapping is a dependency of certain attributes of the presentation object
upon application values. If there were no such dependencies, then the presentation
would be totally independent of the application. When the display is presented to the
specifier, it must be realized with some set of application values. In order to be totally
realistic, the values should be generated by the application and, hence, runtime and
specification time are the same. In some systems (e.g. Serpent), the specifier provides
fixed values for the attributes of the presentation objects which depend upon
application objects. These fixed values then show the specifier one possible display.
The problem of how to realize the interface leads into the idea of having a total
environment for the development of user interfaces. Before discussing that issue,
however, some of the power that a separate construction tool makes possible
will be shown.
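
The two ways of realizing a display can be contrasted in a minimal sketch. It is not
taken from Serpent: the class AppValue, the function realize and the layout dictionary
are hypothetical. The sketch assumes that a layout marks which presentation attributes
depend upon an application value, so that the same layout can be realized either with
fixed values supplied by the specifier or with values generated by the application.

```python
from dataclasses import dataclass
from typing import Dict, Mapping


@dataclass(frozen=True)
class AppValue:
    """Marks a presentation attribute that depends on a named application value."""
    name: str


def realize(layout: Mapping[str, Mapping[str, object]],
            values: Mapping[str, object]) -> Dict[str, Dict[str, object]]:
    """Replace every AppValue marker in the layout by a concrete value."""
    return {
        obj: {
            attr: values[v.name] if isinstance(v, AppValue) else v
            for attr, v in attrs.items()
        }
        for obj, attrs in layout.items()
    }


# Position is fixed by the layout editor; the reading depends on the application.
layout = {"gauge": {"x": 10, "y": 20, "reading": AppValue("pressure")}}

# Specification time: the specifier supplies a fixed value to see one possible display.
preview = realize(layout, {"pressure": 50})
```

At runtime the same realize call would be given the values produced by the application,
which is the sense in which specification time and runtime can coincide.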

6.2.3.2.  Smart  Editors 

When an interface is being constructed, typically there is a particular style being used
for some of the components such as menus. Peridot [Myers 87] is a system that uses
expert system techniques to make inferences about what style is being used for
particular components. For example, the specifier would completely construct one
menu and then, whenever another menu was being constructed, Peridot would propose
that it have the same style as the previous menu. This is one example of the type of
intelligence that could be put into separate dialogue construction tools.
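
The menu example can be approximated by a very small sketch. It deliberately replaces
Peridot's expert system inference with the simplest possible rule, remembering the last
style used for each kind of component, and should not be read as a description of how
Peridot works. The class StyleMemory, its methods and the style attributes are
hypothetical.

```python
from typing import Dict, Optional

Style = Dict[str, object]


class StyleMemory:
    """Hypothetical helper that proposes the last style used for a component kind."""

    def __init__(self) -> None:
        self._last: Dict[str, Style] = {}

    def record(self, kind: str, style: Style) -> None:
        """Remember the style of a component the specifier has fully constructed."""
        self._last[kind] = dict(style)

    def propose(self, kind: str) -> Optional[Style]:
        """Return a copy of the remembered style, or None if nothing is known yet."""
        style = self._last.get(kind)
        return dict(style) if style is not None else None


memory = StyleMemory()
first_proposal = memory.propose("menu")      # nothing constructed yet
memory.record("menu", {"border": "shadow", "font": "bold",
                       "highlight": "reverse-video"})
second_proposal = memory.propose("menu")     # offered for the next menu
```

The proposal is only a suggestion: the specifier remains free to accept it or to
construct the second menu differently.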


6.2.4.  Environment 

Although integrated user interface development and execution environments are
desirable, they have not yet been produced. One system that comes close to an
integrated environment is HyperCard [Harvey 88]. HyperCard is a system that manages
textual and graphical objects in a multidimensional fashion. Each task that is to be
accomplished is represented by a stack of cards. Cards within a stack can be linked to
other stacks to represent associations that the specifier wishes to maintain. Cards can
be searched to locate those that have information of relevance to the implementor.

HyperCard integrates the specification and the runtime by allowing scripts to be
developed while data resides in the stacks. These scripts can then be executed, the
results displayed and the scripts modified. This interaction between specification and
execution allows the development of applications in a very smooth and continuous
fashion.

Within HyperCard, the distinction between the application functional core and the user
interface is blurred. This makes it difficult to separate functionality clearly, which is
the basis of the UIMS.


6.2.5.  State  of  the  Art 

Within the field of user interfaces today, we know how to do things which are
application independent. Menus, scroll bars, etc. are methods of allowing for user input
with low-level feedback which have proven very valuable. What is not known is how to
do things which are application dependent. Semantic feedback (feedback depending
upon application semantics) is not well understood, and current tools do a poor job of
supporting this type of feedback while still providing a clear separation of functionality.